Question:
Is there a way to search an entire site for instances of URLs in the text?
2010-12-16 01:28:32 UTC
Can you type the homepage URL into a search engine or program or something and have it give you a list of every URL listed in the text on every page within the website (publicly accessible pages only, of course)?

I don't know if you can use URLs for spam purposes, but I have no intention of trying to harvest URLs for spam. I can't even think of a way you could spam with URLs, but just in case, I wanted to make it clear I'm not planning to do that.
Four answers:
Keith C
2010-12-16 22:18:11 UTC
There is a sitemap generation website, http://www.xml-sitemaps.com/. Not sure if this is the one you are looking for.

Hope this helps.
2010-12-17 06:23:25 UTC
If you're looking at writing your own utility for this, then the easiest way to accomplish the searching is to use a regular expression.

http://en.wikipedia.org/wiki/Regular_expression

These are formal constructs that allow you to match specific strings. You could, for example, look for every instance of "www.*.*" (where * means anything), or you could look for something even more specific, like "www.*.* but not www.*.edu".
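
As a quick illustration, here is what that "www.*.* but not www.*.edu" idea might look like as an actual pattern in Python. This is only a sketch of the intent: \S+ stands in for "anything", and (?!edu\b) is a negative lookahead that rejects .edu endings.

    import re

    # Match "www.<anything>.<anything>" but skip hosts ending in .edu.
    pattern = re.compile(r"www\.\S+\.(?!edu\b)\S+")

    for sample in ["www.example.com", "www.school.edu", "www.test.org"]:
        print(sample, "->", bool(pattern.search(sample)))
    # prints True, False, True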

Once you can match (or follow) links, it's just a matter of recursively hitting every link on a page, as in the sketch below.

If you know your project is in a specific language (like PHP or Perl), then you can look up a specific reference for that language. Almost every major language supports regular expressions.
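
For the recursive part, here is a rough sketch in Python using only the standard library. The starting URL and the crude href pattern are placeholder assumptions; a real crawler would use a proper HTML parser and respect robots.txt, but this shows the shape of the idea.

    import re
    import urllib.request
    from urllib.parse import urljoin, urlparse

    START = "http://www.example.com/"  # placeholder: the site you want to scan
    HREF = re.compile(r'href="(http[^"]+)"')  # crude link extraction

    def crawl(start, max_pages=50):
        # Breadth-first: collect every URL seen, but only recurse within the site.
        host = urlparse(start).netloc
        seen, queue, found = set(), [start], set()
        while queue and len(seen) < max_pages:
            page = queue.pop(0)
            if page in seen:
                continue
            seen.add(page)
            try:
                html = urllib.request.urlopen(page, timeout=10).read().decode("utf-8", "replace")
            except Exception:
                continue  # skip pages that won't load
            for link in HREF.findall(html):
                absolute = urljoin(page, link)
                found.add(absolute)
                if urlparse(absolute).netloc == host:
                    queue.append(absolute)  # stay on the same host
        return found

    for url in sorted(crawl(START)):
        print(url)

Capping the number of pages fetched (max_pages) keeps a buggy or huge site from running forever; raise it once the sketch works on a small site.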
I ♡ Me
2010-12-16 09:31:54 UTC
This is probably the long way to do it, but after I'm already at a webpage I would go to View, then Page Source, then Ctrl+F and type "www." That will show you where all the web links are.
2010-12-17 06:23:17 UTC
Google does that.


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.