I have a very long post that became very popular, and I am considering converting it into a distributable ebook.

Should I be worried about content duplication?

unor
Anon
  • Google does understand that there will be duplication between HTML and PDF files. Though Google parses and indexes PDFs and other file formats, it also understands that these files are not web content but alternative options for the user. You can always link to it with nofollow, noindex, and if you choose to, you can restrict it using robots.txt. Otherwise, there may be other ideas here. I know this question gets brought up from time to time. Congratulations on the post!! Success such as this is not as easy as people like to think. – closetnoc Feb 05 '15 at 16:00
  • Thank you @closetnoc, my biggest concern is not the PDF on my server, as we can control its appearance in SEs; it's the PDF uploaded to other websites (for example, university websites) as a reference or just for easy access. – Anon Feb 05 '15 at 16:24
  • Excellent point! That had not occurred to me. I did a quick search on PW.SE and did not find anything that answers your question directly. However, the first link below does suggest creating links back to your site, which may help. Here are two links for general info: http://searchengineland.com/eleven-tips-for-optimizing-pdfs-for-search-engines-12156 and http://webmasters.stackexchange.com/questions/74643/how-to-specify-the-canonical-url-for-downloadable-versions-of-the-page-doc-pdf I will look around a bit later today to see if I can find something more concrete. – closetnoc Feb 05 '15 at 16:42
  • So far I am not seeing much. The advice to fill out the PDF document header with author info seems to be the best option so far. I have business to do today, so I will be out for quite a while. But you got me thinking. I will try looking some more. Everything I am seeing is local to your server, including putting a canonical link on your site and what I told you already. Here is one link on that: http://www.bowlerhat.co.uk/prevent-pdf-articles-becoming-duplicate-content/ which lists a couple of options. – closetnoc Feb 05 '15 at 16:56
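
For anyone wanting to try the server-side options mentioned in the comments above (a canonical link for the PDF, noindex, robots.txt), here is a minimal Python sketch. It only covers the copy hosted on your own server, not copies uploaded elsewhere. The URLs, the `/downloads/` path, and both function names are hypothetical and chosen purely for illustration; how the headers actually get attached depends on your web server or framework.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical URLs, used only for illustration.
POST_URL = "https://example.com/blog/long-post/"
PDF_URL = "https://example.com/downloads/long-post.pdf"


def pdf_response_headers(canonical_url, noindex=False):
    """Build extra HTTP response headers for serving the PDF copy."""
    headers = {
        "Content-Type": "application/pdf",
        # A PDF has no <head>, so the canonical hint goes in a Link header
        # that points back at the original HTML post.
        "Link": f'<{canonical_url}>; rel="canonical"',
    }
    if noindex:
        # X-Robots-Tag is the HTTP header counterpart of the robots meta tag.
        headers["X-Robots-Tag"] = "noindex"
    return headers


def is_blocked_by_robots(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text disallows crawling `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Assumed robots.txt that keeps crawlers out of the download directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/
"""

headers = pdf_response_headers(POST_URL, noindex=True)
blocked = is_blocked_by_robots(ROBOTS_TXT, PDF_URL)

for name, value in headers.items():
    print(f"{name}: {value}")
print("PDF blocked by robots.txt:", blocked)
```

One caveat on combining these: if robots.txt blocks the PDF, a crawler that obeys it never requests the file and so never sees the `Link` or `X-Robots-Tag` headers. It is probably better to pick either the robots.txt rule or the headers, not both.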
