SEO Office Hours with John Mueller of Google from January 7th, 2022, which ran just under an hour. There was somewhat of a theme to this office hours, with a lot of questions around internationalization and language, subdomains and subfolders, and of course indexing and crawl budget.
As always, a synopsis of the question and answer, my opinion, and a link to the question in the video. I’ll skip questions that didn’t really have an answer or were so unique I don’t think they would apply to most marketers.
We started off with another crawl budget question. If you blog consistently, compared to someone who doesn’t, will Google crawl at a different rate? Does consistency have anything to do with ranking?
John did say that it doesn’t. Being able to crawl and index a website is definitely a factor in ranking, but if you are comparing one page a week vs. one page a day, the difference is trivial. Paraphrasing: whether it’s one, ten, or ten thousand pages, Google can crawl them in a reasonable timeframe, and crawl budget shouldn’t be a concern.
If I post less often and less consistently, will Google pull my sitemap less frequently? The answer is yes.
There are two types of crawls that Google does.
1 – Discovery crawl – where Google looks for new pages
2 – Refresh crawl – where Google looks to update existing pages that they already know about
Expect refresh crawls of the homepage daily, or even hourly, depending on the site. If new links are found, the discovery crawl will go look at those new pages. Google is smart enough not to revisit pages that don’t change often, so a low crawl frequency isn’t a sign of quality or ranking problems, just that Google doesn’t need to visit the page as often.
Google will not penalize you for having duplicate content on your site, as in a category page that shows an excerpt from a blog post and is therefore exactly the same as the post itself. Google knows that’s going to happen.
This was a really short question, but I think it’s something we deal with often. If you have an older piece of content and you are updating it, should you repost it or update the old post? One thing John said was don’t just change the date without making content changes. He encouraged the questioner to simply update the old post.
Hreflang again. The question was about a site that was doing well in one language, but they created a new domain in English in order to target globally. How should the hreflang tag be handled?
The only time you should use it is if you have the equivalent page in another language. It works on a page-by-page basis, so if you don’t have the same page on the other language site, you shouldn’t add the tag.
John said that with hreflang, the ranking stays the same, but Google swaps out the page for the best-fitting one. This works across different domains and also on the same domain.
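To make that concrete, here’s a rough sketch of what page-level hreflang tags might look like; the domains are made-up examples, and each language version needs a reciprocal set of tags pointing back at the others:

```html
<!-- In the <head> of the English page. Every version lists all
     alternates, including itself, and the other pages must carry
     matching tags. Domains here are hypothetical. -->
<link rel="alternate" hreflang="en" href="https://example.com/widgets/" />
<link rel="alternate" hreflang="es" href="https://ejemplo.es/widgets/" />
<link rel="alternate" hreflang="x-default" href="https://example.com/widgets/" />
```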
Page experience. Are there more signals than just the three we can see in Search Console, those being Core Web Vitals, Mobile Usability, and Page Experience?
Yes, there are, with the addition of intrusive interstitials.
What data is Google Chrome collecting from users that is used for rankings?
Just the aggregated Chrome User Experience data – what users experienced when they went to your website, with regard to page experience ONLY.
I just had a conversation with a small business owner, and he was shocked that they didn’t use more data. I basically let him know that what users do on your website has ZERO impact on your ranking. Google only looks at usability data in the form of Core Web Vitals metrics. Everything else, like bounce rate or time on page, doesn’t actually matter.
And again, Google does NOT use Google Analytics data in ranking.
Remember, Google Analytics data is so easy to manipulate that Google isn’t going to trust it, and not every site has it implemented, let alone implemented correctly. Stop worrying about having Google Analytics installed in order to rank. You can rank with a plain HTML text file. I’ve got examples.
International Targeting:
They started in the US with a .com, and then expanded in the last two years into India, Australia, and Mexico. They’ve purchased ccTLDs and are using them, but the sites still have low authority. They don’t have hreflang tags set up; should they?
The tag itself won’t change the ranking; it just makes sure that the preferred version of the page is shown. If you are already ranking with the .com version but not with the localized version, you should add hreflang tags so that Google swaps the pages in search results.
Updates to job-related markup for direct apply. There was a notice in Search Console for those that use jobs-related markup: a directApply property was added to the markup. I’ve linked a good article about the update by Search Engine Land in the show notes.
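For anyone curious, a simplified JobPosting snippet with the new property might look something like this; the job details are placeholders, and Google’s documentation lists additional required fields:

```html
<!-- Simplified example only; values are placeholders. directApply is
     true when a job seeker can complete the application on this page. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "JobPosting",
  "title": "Software Engineer",
  "datePosted": "2022-01-07",
  "hiringOrganization": { "@type": "Organization", "name": "Example Co" },
  "jobLocation": {
    "@type": "Place",
    "address": { "@type": "PostalAddress", "addressCountry": "US" }
  },
  "directApply": true
}
</script>
```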
They moved from folders to subdomains for country-based jobs. They 301’d, but saw a large drop-off in traffic. They then implemented both folders and subdomains and asked if the canonical is the best method for these pages. Since there are two versions of every page, and there are millions of pages, that may start running into crawl budget issues.
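For reference, a canonical is just a link element in the head of the duplicate version pointing at the preferred URL – something like this, with hypothetical URLs:

```html
<!-- On the subdomain copy of a page, pointing at the preferred
     subfolder version. URLs are hypothetical; note that the
     canonical is a hint to Google, not a directive. -->
<link rel="canonical" href="https://example.com/jobs/us/software-engineer/" />
```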
So, subfolder vs. subdomain. It’s been about 10 years since Monster.com moved from subdomains to subfolders and published a huge case study on it. Each year it seems someone else does something similar, and they all get the same result. So, unless you are constrained by technology, just use subfolders. Brendan Hufford did one of the better writeups on this recently at his SEO For The Rest Of Us blog: https://seofortherestofus.org/seo/subdomain-vs-subdirectory/ It’s a short 2-3 minute read but sums up a few data points and aligns with what I think is the general consensus. For the folks who asked this question, I think the issue isn’t Google, but simply the fact that subfolders work better than subdomains.
How to get another language version of a website into Top Stories. John mentioned page experience metrics would be the first place to look, but there aren’t other technical factors that would contribute. Any website should be able to be shown in Top Stories.
Since the number of SERP features is increasing, how is Search Console calculating rankings?
Search Console shows the average TOP position, so if your site appears at positions 3, 4, and 5 for a query, you are tracked as position 3. Search Console rankings take into account all the SERP features, so business profiles and even images are counted.
Their JavaScript has some strings with forward slashes in them that Googlebot is interpreting as URLs, and they’re seeing errors in Search Console from this. Even though they have millions of pages, John reiterates that for these discovery-type crawls this shouldn’t be a crawl budget issue. If it’s too much for the server, you can adjust your crawl rate, which updates within about a day or so.
To answer the question about how to keep Google from following these URLs, the best way is to put that JavaScript into a file and block it with robots.txt, but you need to ensure the site still renders properly without that JavaScript.
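As a sketch, assuming the problematic strings were moved into a file like /assets/strings.js (a made-up path), the robots.txt rule would be:

```
# Keep Googlebot from fetching the JS file whose strings were being
# misread as URLs. The path is hypothetical; verify the page still
# renders correctly with this file blocked.
User-agent: *
Disallow: /assets/strings.js
```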
Another question about subdomains vs. subfolders, this time related to languages. John said it doesn’t really matter, but take into consideration tracking, legal aspects, or requirements to have ccTLDs.
You know my opinion on this, I think subfolders are the best way to go.
Using rel=nofollow on links to a page to effectively noindex it.
Nope. Nofollow tells Google not to pass any PageRank to those pages; it doesn’t mean they won’t index the pages. I included an article from Search Engine Journal which references some comments John made about the attribute being a hint only. If you want a page to not be indexed, add a noindex tag.
https://www.searchenginejournal.com/why-google-turned-nofollow-to-hint/325713/
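For completeness, the standard way to keep a page out of the index is a robots meta tag on the page itself (or the equivalent X-Robots-Tag HTTP header):

```html
<!-- In the <head> of the page you want excluded. The page must remain
     crawlable (not blocked in robots.txt), or Googlebot will never
     see this tag. -->
<meta name="robots" content="noindex" />
```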
Another “my page isn’t indexed yet” question. A landing page has been live for a month, all the Request Indexing steps and so on were done, but still nothing.
John said that not everything gets indexed.
Some things to consider: make sure the technical side is looking good. Also, the overall quality of the site is a big factor; you may need to work on that, but that is a long-term goal.
Pruning low-quality content. What is the minimum traffic needed to keep an article?
Don’t go just off of traffic for this. Some pages just don’t get a lot of traffic but are important, especially seasonal pages.
Structured markup – specifically breadcrumbs. The takeaway from this response was similar to his December 24, 2021 office hours, where he said Google looks at what is visible on the page versus what you have marked up in structured data.
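As an illustration of keeping the markup in sync with the page, a BreadcrumbList matching a visible “Home > Blog > This Post” trail could look like this (URLs and names are placeholders):

```html
<!-- The markup should mirror the breadcrumb trail users actually see.
     The last item may omit "item" since it's the current page. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com/" },
    { "@type": "ListItem", "position": 2, "name": "Blog", "item": "https://example.com/blog/" },
    { "@type": "ListItem", "position": 3, "name": "This Post" }
  ]
}
</script>
```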
The new extensive review updates – how are they affecting new e-commerce sites?
As user expectations change, so does Google, and this is an example of that. The question then went on to native platform review functionality, and it seems that it doesn’t really matter so long as Google can find and read the reviews.
They had a lot of content theft, and after having the stolen copies removed via DMCA, they didn’t see any increase in ranking. John went on to say that this wouldn’t affect ranking; there’s nothing in the algorithm about the uniqueness of your page or site that boosts it for other, more generic terms.
Google is pulling the wrong info for their site – specifically, the dates are showing the wrong year in the SERPs. Their clickthrough has dropped, and they are considering putting data-nosnippet around the content, but wanted to know if there were any other ideas.
Google pulls dates in a variety of ways, but one thing you can do is use date structured data, on its own or as part of event structured data, so Google knows the exact date from the markup. Otherwise it could be picking up other numbers and thinking they are years.
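For example, stating the date explicitly in Article markup, and wrapping stray numbers in data-nosnippet, might look like this (the dates and content are placeholders):

```html
<!-- Explicit dates so Google doesn't have to guess the year. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "Article",
  "headline": "Example Headline",
  "datePublished": "2022-01-07",
  "dateModified": "2022-01-07"
}
</script>

<!-- data-nosnippet keeps this content out of search snippets; Google
     supports it on span, div, and section elements. -->
<div data-nosnippet>Catalog ref. 1998-B, model 2019</div>
```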
I’ve seen this too many times with phone numbers, where Google wasn’t sure if a number was an office, mobile, or fax number. When someone searches, they end up calling a fax line or some corporate number instead of the local one, simply because Google can’t tell from the HTML.