Marketing News – S2E1: Recap of Google Office Hours with John Mueller from 12-31-2021 – Google not indexing my site, translations, and hidden text is OK? January 3rd, 2022

There wasn’t a lot of SEO news over the holiday weekend and the new year, but I wanted to recap what I felt were the most important takeaways from Friday’s English Office hours with Google’s John Mueller. As always, a synopsis, my opinion, and a link to the question in the video.

Some people rely on the cached pages in the search results to verify that their content is being indexed correctly. The cached pages are handled separately from the index, so you may have a page in the index that you don’t have a cached page for.

I really like the cached pages; they give you a good idea of the actual content that Google sees, especially if you do client-side rendering.
https://youtu.be/MrgtKt4u8nk?t=334
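
If you want a quick way to check what is in the raw HTML versus what only appears after client-side rendering, something like the small script below can help. This is a minimal sketch, assuming the `requests` and `beautifulsoup4` packages and a hypothetical URL and phrases; a full picture of what Google renders still requires a headless browser or the URL Inspection tool.

```python
# Minimal sketch: check whether key content is present in the raw (unrendered)
# HTML, i.e. without JavaScript running. The URL and phrases are hypothetical.
import requests
from bs4 import BeautifulSoup

URL = "https://www.example.com/products/widget"          # hypothetical page
EXPECTED_PHRASES = ["Blue Widget 3000", "Add to cart"]    # content you expect to see

resp = requests.get(URL, headers={"User-Agent": "raw-html-check/1.0"}, timeout=30)
resp.raise_for_status()

# Extract the text content from the raw HTML only.
visible_text = BeautifulSoup(resp.text, "html.parser").get_text(" ", strip=True)

for phrase in EXPECTED_PHRASES:
    status = "present" if phrase in visible_text else "MISSING (may rely on client-side rendering)"
    print(f"{phrase!r}: {status}")
```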

Poor automated translations of an alternate language version: how does this affect the SEO of the main language version? This comes down to the metrics Google uses that are site-wide factors versus single-page factors. If you have low-quality pages, whether that’s thin content, badly translated content, or another issue, this can bring down your overall site/domain quality and be harmful to your SEO efforts. The discussion concludes that, at the least, you should spend the money on quality translation for your top pages.
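
If you decide some machine-translated pages are only there for users, one way to act on the advice later in the transcript is to keep them out of the index with a robots noindex meta tag. Below is a minimal sketch of that idea; the Flask app, route, template, and the `is_machine_translated` helper are all hypothetical, not anything from the office hours.

```python
# Minimal sketch: serve auto-translated pages with a robots noindex meta tag so
# Google doesn't evaluate them for quality. The app, route, and helper are hypothetical.
from flask import Flask, render_template_string

app = Flask(__name__)

PAGE = """<!doctype html>
<html lang="{{ lang }}">
<head>
  {% if noindex %}<meta name="robots" content="noindex">{% endif %}
  <title>{{ title }}</title>
</head>
<body>{{ body }}</body>
</html>"""

def is_machine_translated(lang: str) -> bool:
    # Hypothetical rule: only the English version is human-written/reviewed.
    return lang != "en"

@app.route("/<lang>/about")
def about(lang: str):
    return render_template_string(
        PAGE,
        lang=lang,
        noindex=is_machine_translated(lang),  # keep low-quality translations out of the index
        title="About us",
        body="...",
    )
```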

Google’s crawler ignores all permission requests from your site. This came up in a question about IP-based content detection that was causing a site to skew towards a specific geographic area for indexing. As they move away from that approach and come up with a solution that lets the end user choose a location, the concern was how Googlebot would interact with a form or button and whether that would cause the same kinds of indexing issues. John Mueller goes on to say that if it’s a form or a button, Googlebot is not going to click it or fill anything out, but if it is a link then it will most definitely try the link. If the site uses your location and there is a permission request, Googlebot will ignore this request; he said it ignores all such requests.
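
Since Googlebot follows links but won’t click buttons, submit forms, or grant a geolocation prompt, one crawl-friendly pattern is to expose each location as a plain link. The sketch below illustrates the idea with a hypothetical Flask app; the routes and region list are made up for illustration, not taken from the question.

```python
# Minimal sketch: expose location choices as plain crawlable links instead of a
# geolocation permission prompt or a form, so Googlebot can reach every regional
# version. The app, routes, and regions are hypothetical.
from flask import Flask

app = Flask(__name__)
REGIONS = {"us-east": "East Coast", "us-west": "West Coast", "midwest": "Midwest"}

@app.route("/stores")
def store_index():
    # Plain <a href> links: Googlebot will follow these like any other link.
    links = "".join(
        f'<li><a href="/stores/{slug}">{name}</a></li>' for slug, name in REGIONS.items()
    )
    return f"<h1>Choose your region</h1><ul>{links}</ul>"

@app.route("/stores/<slug>")
def store_region(slug: str):
    name = REGIONS.get(slug, "Unknown region")
    return f"<h1>Stores: {name}</h1>"
```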

If you have pages that are authoritative and have good backlinks, but you don’t want them indexed, consider using a rel canonical tag to flow that authority to a different page. The question asked about using a redirect just for Googlebot, which is very borderline against Google’s guidelines, but because these pages aren’t indexed it’s actually OK.

My opinion is that any time you treat a user differently than Google, you make things a lot harder. It’s also not easy to test and reproduce redirects for a single crawler, and you can cause other issues down the line.
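
Because this is awkward to verify by hand, a small script can at least confirm that the pages you don’t want indexed point their rel=canonical at the page that should collect the authority. A minimal sketch, assuming the `requests` and `beautifulsoup4` packages and hypothetical URLs:

```python
# Minimal sketch: verify that pages carry the rel=canonical you expect.
# URLs below are hypothetical placeholders.
import requests
from bs4 import BeautifulSoup

# page we don't want indexed -> page that should collect the authority
EXPECTED_CANONICALS = {
    "https://www.example.com/old-landing-page": "https://www.example.com/main-landing-page",
}

for url, expected in EXPECTED_CANONICALS.items():
    html = requests.get(url, timeout=30).text
    tag = BeautifulSoup(html, "html.parser").select_one('link[rel="canonical"]')
    actual = tag.get("href") if tag else None
    status = "OK" if actual == expected else "MISMATCH"
    print(f"{status}: {url} -> canonical {actual} (expected {expected})")
```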

This question is really relevant to work I’ve been doing recently, where we saw over 150k pages removed from the index within about a week after Google picked up a no-results message that was in the source code but hidden. If you have any text in your source code, whether hidden or not, Google CAN index it and use it. If it happens to be an error message, this can trigger Google to think the whole page is an error page and then ignore it moving forward. If you have content you don’t want indexed, don’t have it in the source code of the page. Also, consider that hidden text is indexable: if you have a shortened version of product text where only a portion is visible and the rest is hidden, you aren’t going to get a cloaking penalty, but you do get the benefit of that content being indexed and helping your SEO.
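
A cheap way to catch this before Google does is to scan your key URLs for the hidden error string in the raw source. A minimal sketch, assuming the `requests` package; the URLs and the message text are hypothetical placeholders, not our actual pages.

```python
# Minimal sketch: scan a list of URLs for a hidden "no results" message in the
# raw source, since Google can index hidden text and may treat the whole page as
# an error page. URLs and message text are hypothetical.
import requests

URLS = [
    "https://www.example.com/category/widgets",
    "https://www.example.com/category/gadgets",
]
HIDDEN_ERROR_TEXT = "No results found"  # string you never want shipped in the source

for url in URLS:
    html = requests.get(url, timeout=30).text
    if HIDDEN_ERROR_TEXT in html:
        print(f"WARNING: {url} contains {HIDDEN_ERROR_TEXT!r} in its source")
    else:
        print(f"OK: {url}")
```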

Using the Indexing API to get your pages indexed faster than letting Google find them via a sitemap: John says that yes, Google will crawl the page so it can review the content and the structured data associated with jobs. You are limited to 200 URLs, and the process to increase this quota is not very straightforward. I can tell you that I have successfully submitted over 20k URLs in a single day and have seen them in the index within 24 hours, and these are not job pages. There have been a few articles around, and Yoast for WordPress started implementing this in their plugin back in February of 2019.
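
For reference, this is roughly what a single submission to the Indexing API looks like. A minimal sketch, assuming the `google-auth` and `requests` packages and a service account that has been added as an owner of the property in Search Console; the key file path and URL are placeholders.

```python
# Minimal sketch: publish a URL_UPDATED notification to Google's Indexing API.
# Assumes google-auth + requests and a service account added as an owner in
# Search Console; the key file and URL are hypothetical placeholders.
import json

import requests
from google.oauth2 import service_account
from google.auth.transport.requests import Request

SCOPES = ["https://www.googleapis.com/auth/indexing"]
ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

creds = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=SCOPES)  # hypothetical key file
creds.refresh(Request())  # fetch an access token

payload = {"url": "https://www.example.com/some-page", "type": "URL_UPDATED"}
resp = requests.post(
    ENDPOINT,
    headers={"Authorization": f"Bearer {creds.token}",
             "Content-Type": "application/json"},
    data=json.dumps(payload),
    timeout=30,
)
print(resp.status_code, resp.json())
```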

Discussion about title rewrites that don’t make sense: it seems like this is still happening, and it can definitely be an issue if you are adding new types of content or products and Google hasn’t fully aligned them with your domain.

This is the big one: my site isn’t getting indexed. How long does Google take to pick up on quality changes? You are looking at several months. Keep in mind as you build up pages and make them more robust that your overall site quality score will take a long time for Google to fully update before it gives you additional authority. This is where, in my opinion, it makes a lot of sense to make these types of changes in a dev environment and launch the updates all at once, sending a clear signal to Google. John mentions to make sure you do the full thing; don’t make small changes and wait to see how Google reacts, as it’s a long timeframe.

Just a closing note: Screaming Frog updated their software to version 16.5, which closed another Log4j vulnerability, but a new one has since surfaced, and they responded to my tweet asking what their timeline is for a fix. They can provide a patched version via an email to their support team. Once again, Screaming Frog has just absolutely amazing customer service.

Transcript:

Welcome to The Opinionated SEO, where we talk about recent news and updates in the digital marketing world of SEO, paid advertising, and social media that impact you as a marketer. I’ll also throw some of my opinion into the mix. Now, there wasn’t a lot of SEO news over the holiday weekend and the new year, but I wanted to recap what I felt were the most important takeaways from Friday’s English office hours with Google’s John Mueller. As always, I’ll give a synopsis, my opinion, and a link to the question in the video.

Some people may be relying on the cached pages in the search results to verify that their content is being indexed correctly. Now, the cached pages are actually handled separately from indexing, so you may have a page that’s in the index but not have a cached page for it. My opinion: I really like looking at cached pages. I think they give you a great idea of the actual content that Google sees, especially if you’re doing client-side rendering. So make sure to take a look at that and see if there’s anything in there that you weren’t expecting to see, something showing up that maybe should have been hidden, or that you weren’t expecting Google to be able to pull from, say, JavaScript or an Ajax request.

The next question was really interesting because I’m going through a lot of this right now, and it relates to translations. The discussion was about poor translations from automated translation of an alternate language and how that affects the SEO of the main language version.

And this actually comes down to the metrics that Google uses that are site-wide factors versus single-page factors. So if you have low-quality pages overall, whether it’s thin content, badly translated content, or another issue, this can actually bring down your overall site and domain quality rankings and be harmful to your SEO efforts.

As a whole, the discussion concludes that, at the least, you should spend money on quality translation for your top pages. I’ve seen this a lot where automatic translation was set up, and unfortunately now you’ve just doubled the size of your site, all of those pages are kind of garbage, and that just brings the overall quality of your domain down and can affect your rankings.

So only translate the things that you need to, or if the translations are just for users and you don’t need them indexed, remove them from the index: noindex those pages. That way you don’t have to worry about Google seeing them and saying they’re low quality, because it won’t even look at them.

Another interesting question that came up talked about IP-based content detection. I think the big takeaway here is that Google’s crawler ignores all permission requests from your site. So when your site asks, “Can I have your location?” to find maybe the nearest store or products that are in your area, things like that.

Googlebot isn’t going to be able to do any of that. This question came up because the site was presenting content based on IP detection, and it was causing their site to skew towards a very specific geographic area for indexing. As they move away from that, to make sure that their national brand actually shows up for all the products that are available across the board, they’re trying to come up with a better way to implement it, and they’re worried that Googlebot might still interact with some of their location-based settings and affect things the way they were before. So John Mueller went on to say that if it’s a form or a button, Googlebot’s not going to click or fill anything out, but if it’s a link, then it will most definitely try the link, and if the site uses your location and there’s a permission request, Googlebot will ignore it because it ignores all requests. So you should be fine. This one’s really interesting. I would say to test, test, test. What would make the most sense here is to try at least two different tests, one that’s IP-based and one that’s permission-based, see how Google handles them differently, and see whether your implementation works the way you expect it to.

There was a great question about pages that are authoritative and have good backlinks, but for whatever reason, maybe you don’t want them to be indexed. How do you handle those? John came back with a great suggestion of using the rel canonical tag to flow the authority to a different page.

The question was really asking, can I use a redirect just for Googlebot, which is really borderline against the Google guidelines. Since the pages aren’t indexed, it’s actually okay, but if those were indexed pages, that would definitely be against the guidelines. So my opinion here is that any time you’re treating a user differently than Google, you’re making things a lot harder. It’s also not easy to test and duplicate redirects for a single crawler, and you can cause other issues down the line. I would say, if at all possible, always try to give Googlebot the exact same representation as any user so you never run into those issues. And beyond that, rel canonical is an amazing tag, it does a great job, and I haven’t seen a lot of issues with it.

The next question that really stood out to me is very relevant to work I’m doing, because recently we saw about 150,000 pages removed from the index within about a week as Google picked up a no-results message that we had in the source code, even though it was hidden. If you have any text in your source code, whether hidden or not, Google can index it and use it.

If it happens to be an error message, this can actually trigger Google to think that the whole page is an error page and then ignore it moving forward. If you have content you don’t want indexed, don’t have it in the source code of the page. Also, consider that text that’s hidden is indexable. If you have a shortened version of the product text where only a portion is visible and the rest is hidden, you aren’t going to get a cloaking penalty, but you get the benefit of that content being indexed and helping your SEO.

Back in early 2019, there was a lot of discussion about using the Indexing API to get your results indexed faster than just letting Google know via a sitemap. John Mueller said that, yes, it will crawl the page so it can review the content and the structured data associated with jobs. But you’re typically limited to 200 URLs.

When you start out, the process to increase that, I’ll tell you right now, is not very straightforward. However, I’ve successfully submitted over 20,000 URLs in a single day and have seen them in the index within 24 hours, and these are not job-related pages, so it is possible. There are a few articles around, and Yoast for WordPress started implementing this in their plugin back in February of 2019; I’ve got a link to their blog post about that and some of the tests that were done.

This was a hot topic about a month or two ago, but there was a discussion about title rewrites that just don’t make sense. It seems like for this person it was still happening, and it can definitely be an issue for you if you’re adding new types of content or product lines and Google hasn’t fully aligned that with your domain.

So take a look at this one. It’s a little bit of a short discussion, but something to consider: if you’re going to be making a major change or adding a whole new product segment, your titles might be a little bit off. Something to keep in mind as you build that section out.

This next one’s a big one: people saying my site isn’t getting indexed. How long does Google take to pick up on quality changes? You’re looking at several months. Keep in mind as you build up pages and make them more robust that your overall site quality score will take a long time for Google to fully update and give you additional authority.

So if you feel like your site isn’t getting indexed because it’s lower quality, and you need to increase that quality by building up better content, better internal linking, things like that, it’s going to take a while for Google to understand that. So my opinion is to make a lot of these changes in a dev environment and then push them all in a single update, which is going to send a very clear signal to Google.

John mentioned to make sure you do the full thing. Don’t make small changes, wait a couple of months, see how Google reacts, and then go back and make more small changes. Those incremental changes, though positive, aren’t going to have the same kind of impact as doing one large change overall, which sends that clear signal.

And the last thing on the list here: Screaming Frog continues to make updates to their software. Version 16.5 closed another Log4j vulnerability that showed up; however, another one surfaced, and they keep surfacing.

This is a third-party library that they’re using, so it’s not something that’s their fault, but they did respond to my tweet asking what their timeline was for a fix: basically, go ahead and email support and they can send you a patched version. This once again shows how Screaming Frog has just absolutely amazing customer service.

I highly recommend their software and it’s worth every penny.

Everyone have a great new year, and we’ll see you tomorrow.