How to obtain a large, random and representative sample of keywords and content summary
I know this isn't exactly SEO. In fact, it's kind of reverse SEO, so I figured the wisdom of this sub would be relevant. Apologies if I am wrong about that; I'm happy to be pointed towards a more appropriate sub.

I am engaged in some research that needs to establish the relative frequencies of different kinds of keywords. I don't want to do anything other than that, so there is no commercial angle at all. The steps involved are:

1/ Access a random URL
2/ Check the content is English (if not, go to step 1)
3/ Check that keywords/content summary exist (if not, go to step 1)
4/ Extract the keywords and/or content summary
5/ Go to step 1

I'm scratching my head over two things. The first is how close I can get to obtaining a random sample of web pages. The second is, when parsing HTML (which I have never studied or used), which tags are used to indicate a content summary? I have seen references to approved and deprecated tags, but I don't care which has been used; I just want to be able to find all instances of a content summary. So what should I look for?
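
For steps 2 to 4 above, here is a minimal sketch of what the check might look like, assuming Python and its standard-library html.parser. The set of meta names in SUMMARY_KEYS is my own assumption about where keywords and summaries usually live (the classic "keywords" and "description" meta tags plus Open Graph, Twitter and Dublin Core variants); it is not an exhaustive or authoritative list. The names MetaSummaryParser, extract_summary and keep_page are hypothetical. The language check only reads the declared lang attribute on the html tag, which many pages omit or get wrong, so real language detection would probably need to look at the text itself.

```python
from html.parser import HTMLParser

# Meta names/properties commonly used for keywords or a page summary.
# Assumption: this list is illustrative, not a complete standard.
SUMMARY_KEYS = {
    "keywords", "news_keywords", "description",
    "og:description", "twitter:description",
    "dc.description", "dc.subject",
}


class MetaSummaryParser(HTMLParser):
    """Collects the <html lang> value and keyword/summary <meta> tags."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us;
        # attribute values may be None when written without "=value".
        attrs = {k: (v or "") for k, v in attrs}
        if tag == "html" and self.lang is None:
            self.lang = attrs.get("lang", "").strip().lower() or None
        elif tag == "meta":
            # Classic tags use name=..., Open Graph uses property=...
            key = (attrs.get("name") or attrs.get("property") or "").strip().lower()
            content = attrs.get("content", "").strip()
            if key in SUMMARY_KEYS and content:
                self.found.setdefault(key, content)  # keep first occurrence


def extract_summary(html_text):
    """Return (declared language or None, {meta key: content})."""
    parser = MetaSummaryParser()
    parser.feed(html_text)
    parser.close()
    return parser.lang, parser.found


def keep_page(html_text):
    """Steps 2 and 3: return the found tags if the page declares English
    and has at least one keyword/summary tag, otherwise None."""
    lang, found = extract_summary(html_text)
    is_english = lang is not None and (lang == "en" or lang.startswith("en-"))
    return found if (is_english and found) else None
```

Calling keep_page on the HTML of each fetched page would give either a dict of the tags found or None, in which case the loop goes back to step 1. Fetching the page and choosing the URL are left out on purpose, since how to sample URLs is the open question.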
© 2025 Indiareply.com. All rights reserved.