Today I’ll be recapping the Google Office Hours with Google’s John Mueller from January 14, 2022. As always, I’ll provide a synopsis of each question and answer, my opinion, and a link to the question in the video. The video ran an hour; I’ll try to summarize it for you in about ten minutes.
David asks:
They have a robots.txt file of over 1,500 lines that refers to HTML fragments and AJAX calls.
Are there any negative SEO effects with a large robots.txt file?
There aren’t any direct negative SEO effects, but John cautioned that a file that large may be harder to maintain and could cause issues you don’t notice.
So size matters, I guess, but it’s more about how you use it.
If we don’t include the sitemap in the robots.txt, are there any negative SEO implications?
No. To Google, any way of submitting a sitemap is equivalent.
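To make both points concrete, here’s a minimal robots.txt sketch; the paths and sitemap URL are placeholders for illustration, not taken from the question:

```
# Block hypothetical AJAX endpoints and HTML fragments from crawling
User-agent: *
Disallow: /ajax/
Disallow: /fragments/

# Optional: the sitemap can be declared here, or submitted via Search Console instead
Sitemap: https://www.example.com/sitemap.xml
```

A few targeted Disallow rules like these usually beat 1,500 individually listed URLs, and they’re far easier to audit.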
The next question gets into the main part of the issue: how does Google handle HTML fragments and AJAX calls that aren’t user-relevant? Do they get into the index?
John had a good response, and it’s what I think works best: test it out. Allow a portion to get crawled and see how Google reacts to it: what content is now being shown, and does that even matter? The time spent keeping Google from seeing these small things may not be worth it.
Is there any guideline for robots.txt file size? That would be a no. Here are some resources. It’s super easy to disallow the wrong thing, and honestly, it’s way easier to get in trouble with a robots.txt file than you’d think. If you are technically inclined, the open source code of Google’s robots.txt parser is on GitHub – linked in the show notes.
https://developers.google.com/search/docs/advanced/robots/intro
https://github.com/google/robotstxt
Heading text: how important is it to use headings hierarchically versus just however you want? Does size matter? The questioner had looked at Google’s own blogs and how they’re structured.
“Do not assume just because Google does it that it’s the best way to do it.”
We use headings to understand the context of individual pieces of content. It can help if you have a hierarchy on the page. Usability and accessibility are also key reasons to use them.
I’ve found that using a single H1, and then multiple nested H2s and H3s, really helps make the page work well. With the passage ranking update, I’ve found that my sites do well for a lot of those H2s and H3s.
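As an illustration of what I mean by a hierarchy (the topic and headings here are invented):

```html
<h1>The Complete Guide to Sourdough</h1>
<h2>Choosing Your Flour</h2>
<h3>Whole wheat</h3>
<h3>Rye</h3>
<h2>Building a Starter</h2>
```

One H1 for the page topic, H2s for the major sections, and H3s nested under the H2 they belong to.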
They’re adding a new domain for a new service offering. How long until its traffic starts looking like their current domain’s?
There isn’t a fixed time for that kind of change. They are updating structure, content, and even their offering on a new domain, so that takes time to process.
Remember that John mentioned in the last couple of office hours how it’s all about the quality of the site as a whole, and those metrics take a while; if you change everything (URLs, pages, etc.), then it kind of starts over. Just changing your domain, on the other hand, will transfer your quality rating.
Let me just say that I love this upcoming question. It’s about custom content, or rather a custom page, that is geo-located by state. The content would be noindexed so it wouldn’t count as duplicate.
Googlebot crawls mostly from California, so for most geo-location systems it would only see the California content, none of the other areas. If you noindexed that page, you would likely see your homepage drop out of the search results. That would be a bad thing.
In these situations, John generally recommends that instead of redirecting, you make it easier for the user to find that content, for example with a dynamic banner containing location-specific links. Those location pages would be indexable.
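Something like this sketch, where the banner is shown dynamically but the state pages themselves are plain, crawlable URLs (the links and copy are hypothetical):

```html
<!-- Shown only to visitors geolocated to Texas; Googlebot still sees
     the default page, and the state page stays indexable -->
<div class="geo-banner">
  Looks like you're in Texas.
  <a href="/locations/texas/">See Texas-specific info</a>
</div>
```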
The other approach would be to swap out some sections of the page with dynamic location content, in this case state-specific.
John went on to say that landing pages for states with similar content wouldn’t be problematic, but a page for every city in every state would look spammy; if it’s just a handful of states, you’re fine.
So it’s OK to have 50 state-based homepages, even if the information isn’t that different.
The questioner is seeing good conversion rates in their app, but not so much on their mobile website. The question was about redirecting users to the app instead of keeping them on the page, and how that might impact SEO.
1 – Make sure not to redirect Googlebot, or your site won’t get indexed.
2 – You won’t have much Core Web Vitals data if you’re redirecting people away from your site.
It comes down to usability: give your users the choice to go to the app or not, as in the sketch below.
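A minimal sketch of the “offer, don’t redirect” approach; the app ID and URL are placeholders:

```html
<!-- iOS Safari renders this as a native Smart App Banner the user can dismiss -->
<meta name="apple-itunes-app" content="app-id=123456789">

<!-- Or a plain in-page banner; no redirect, so Googlebot still crawls the
     page and you keep collecting Core Web Vitals field data -->
<div class="app-promo">
  Prefer the app? <a href="https://example.com/get-the-app">Download it here</a>
</div>
```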
And again, we get a question about region-specific subdomains. They wanted to know if the pages had to be exact copies, just with different currency and language. John said they don’t; they need to be equivalent, but there are cases where a brand name differs country by country. Google uses hreflang to determine the equivalent page from the site owner’s point of view, and swaps them out.
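For reference, hreflang annotations look like this; the subdomains and paths here are hypothetical:

```html
<!-- Each market's page declares all of its equivalents, plus a fallback -->
<link rel="alternate" hreflang="en-us" href="https://us.example.com/widgets/">
<link rel="alternate" hreflang="de-de" href="https://de.example.com/widgets/">
<link rel="alternate" hreflang="x-default" href="https://www.example.com/widgets/">
```

The set needs to be reciprocal: every page in the group should list the others.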
This question was about backlinks, and removing spam backlinks once you hit the 2 MB file size limit on the disavow file.
John recommends using the domain: directive as much as possible. Also, don’t try to clean up all the links; it’s almost impossible. Focus on pages that look like you may have paid for them. Don’t worry about links from spammy pages, scraped copies, or random forum posts.
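A disavow file sketch showing the domain: directive (the domains and URL are invented):

```
# One domain: line covers every link from that host and saves file size
domain:spammy-directory.example
domain:paid-links.example

# Individual URLs only where disavowing the whole domain is too broad
https://forum.example/thread/12345
```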
Do I need to mark up content on my page with HTML semantic elements, for example footer?
For some elements, it does give Google a bit more context about the content of your page. Things like footer don’t really give them much more info. Some amount of semantic HTML does make sense for SEO, usability, and browser compatibility.
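Here’s a quick sketch of the kind of semantic structure being discussed; the content is placeholder:

```html
<body>
  <header>Site name and navigation</header>
  <main>
    <article>
      <h1>Post title</h1>
      <p>The main content, which is what Google cares most about.</p>
    </article>
  </main>
  <footer>Copyright and contact links (adds little for SEO)</footer>
</body>
```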
Should we expect a drop in traffic if we remove AMP?
The short answer: no.
If you are moving from very fast AMP pages to slow non-AMP pages, then you might see a difference. There aren’t any search features right now that are AMP-only, and AMP is NOT a ranking factor. There’s a good Kinsta article about lead drops caused by using AMP, something like a 59% drop in leads. Link in the show notes.
Do expect, as with any migration, a short transition period where there may be some disruption.
https://kinsta.com/blog/disable-google-amp/
We’ve got another disavow question. They’ve disavowed a few thousand links, but they’re worried they may have disavowed some legitimate ones. Should they just remove the disavow file completely and see what happens?
Go for it. For most sites, you don’t need a disavow file, and it’s easy to get things wrong with one. If you want, remove portions incrementally and see how it goes; you’ll probably find that you don’t need the file at all.
I’ve found that the only time you really need one is if a previous SEO bought links, or if someone is running negative SEO against you with paid links. Otherwise, just let it be; you should be fine.
They’re seeing a knowledge panel on mobile but not on desktop, and asked whether Wikipedia influences when one is shown.
There shouldn’t be a difference between mobile and desktop; however, some features are turned on and off depending on available screen real estate, so that may be what’s happening.
There are many sources, and you can see them cited in the knowledge panels themselves, so that’s a place to look.
Jason Barnard – https://www.youtube.com/watch?v=1OFPrIcv7_8 – mentioned by John. This is a great interview and may just lead you down a rabbit hole of knowledge panels.
Synonyms – the entire system is automated; there’s nothing manual.
10–15% of queries are completely new every day, so handling them manually wouldn’t be possible.
They have several locations; what is the best way to include local schema?
Have location pages, with different schema on each page covering the address, phone, hours, etc.
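For example, each location page could carry its own LocalBusiness markup along these lines (all the business details here are invented):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Plumbing - Austin",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main St",
    "addressLocality": "Austin",
    "addressRegion": "TX",
    "postalCode": "78701"
  },
  "telephone": "+1-512-555-0100",
  "openingHours": "Mo-Fr 08:00-17:00"
}
</script>
```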
I have 15–20 FAQs on my page; should I mark up all the questions, or just the ones I consider important? If you want to. Really, you should make sure that whatever is marked up actually appears on the page, but if you only want to mark up a handful, that’s OK.
But unless you have something you don’t want seen, it makes sense to mark it all up, so you’re handing the data to Google in the easiest form.
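For reference, FAQ markup looks like this, with one entry per question you choose to include (the question and answer are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "Do you ship internationally?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Yes, we ship to most countries."
    }
  }]
}
</script>
```

Just make sure each marked-up question actually appears on the page, which leads into the next question.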
Can Google use content that is initially hidden, like accordions and dropdowns, specifically for structured data? You are less likely to get featured snippets or People Also Ask results from content that is hidden until user interaction. For FAQs, the question has to be visible.
The questioner works on an ecommerce site that Google considers an adult site, and in the search results they’re no longer seeing adult sites, only generic ones like Amazon. There wasn’t an answer, so the question was taken to chat and forwarded to the SafeSearch team.
A few questions about product reviews:
1 – The in-depth product reviews update rolled out in the US; when will it roll out to France or Germany? There is no timeframe; it could be anytime.
2 – How were the reviews assessed, by humans or software? It was all algorithmic approaches, with a lot of machine learning.
We did it, a site-not-being-indexed question! This one wasn’t bad though; I like where they went with it.
Coverage issues: “Crawled – currently not indexed” and “Discovered – currently not indexed”.
What are some things that can help get these pages indexed faster? The questioner offered a few ideas:
1 – Linking from the homepage
2 – Linking from existing indexed pages
3 – More backlinks
John agreed, and added to make sure internal linking and overall site quality are strong. You may want to concentrate value into fewer pages.
The manufacturer publishes their product to 10–20 sites, all with the same product description. Should I add unique content to my site?
Yes, but don’t just add fluff; it should be content that adds value for the user. Don’t pad the page with extra text, for example category info pulled from Wikipedia.
That’s all for today, I’ll see you guys tomorrow!