
Email Grabber - FAQ: How can I ...
... crawl a portal or classified ads site for email addresses?



Websites with many ads or articles that follow the same layout usually also have very similar URLs. It is best to check this beforehand in the browser (e.g. Internet Explorer™) by looking at several pages; you will often come across a scheme like this:
http://www.example.com/item/item004.htm.

By checking the site further, you can usually determine the exact, or at least the approximate, range of numbers for which pages exist.
In this example, the pages ..../item001.htm to ..../item300.htm are to be added to the list of search files.
  • Menu: Search files ❯ Add ❯ Numbered URLs
  • Enter the URL with <C> as a placeholder for the number, in this case:
    www.example.com/item/item<C>.htm
  • The start value here is 1, the end value is 300, and the step size is 1
  • Button: leading zeros ❯ 3 digits (to get 001 ... 300 rather than 1 ... 300)
  • Click "Add URL(s)" and close the window
  • Press the start button
     
  • In the dialog for start preparation:

    • Set the number of link tracking levels to 0 if the addresses are located directly on the selected pages. If they are on a page that is one click away, set it to 1, and so on.
    • Level from which only internal links are followed: 0
    • Press the "Start ❯" button in the dialog window
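
To make the "Numbered URLs" settings concrete, the following Python sketch expands a URL template the way the dialog values above describe: a placeholder, a start value, an end value, a step size, and a number of digits for leading zeros. It is only an illustration, not Email Grabber's own code; the function name expand_numbered_url and its parameters are hypothetical.

```python
def expand_numbered_url(template, start, end, step=1, digits=0, placeholder="<C>"):
    """Return one URL per counter value, zero-padded to `digits` characters.

    Hypothetical helper for illustration; not part of Email Grabber.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    if placeholder not in template:
        raise ValueError("template does not contain the placeholder")
    return [
        # zfill pads with leading zeros, e.g. 4 -> "004" for digits=3
        template.replace(placeholder, str(n).zfill(digits))
        for n in range(start, end + 1, step)  # end value is inclusive
    ]


urls = expand_numbered_url(
    "http://www.example.com/item/item<C>.htm", start=1, end=300, digits=3
)
print(len(urls))   # 300
print(urls[0])     # http://www.example.com/item/item001.htm
print(urls[-1])    # http://www.example.com/item/item300.htm
```

Without the leading zeros (digits=0) the first URL would end in item1.htm, which does not exist on a site that numbers its pages item001.htm, item002.htm, and so on.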

The 300 pages are searched for email addresses, then the search stops.
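
The effect of the link tracking level can be sketched as a depth-limited search. The example below is an assumption-laden illustration, not the program's actual algorithm: the pages dictionary is a hypothetical in-memory stand-in for the website (URL mapped to page text and outgoing links), collect_emails is an invented name, and the regular expression is deliberately simple and will not match every valid address.

```python
import re
from collections import deque

# Deliberately simple pattern; real-world address syntax is broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def collect_emails(pages, start_urls, max_depth=0):
    """Search start_urls and follow links up to max_depth clicks away.

    `pages` is a hypothetical stand-in for the web: URL -> (text, links).
    """
    found = []
    seen = set(start_urls)
    queue = deque((url, 0) for url in dict.fromkeys(start_urls))
    while queue:
        url, depth = queue.popleft()
        if url not in pages:
            continue  # page does not exist, skip it
        text, links = pages[url]
        found.extend(EMAIL_RE.findall(text))
        if depth < max_depth:  # only follow links while below the level limit
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return found


pages = {
    "item001.htm": ("Contact: alice@example.com", ["contact001.htm"]),
    "contact001.htm": ("Sales: bob@example.com", []),
}
level0 = collect_emails(pages, ["item001.htm"], max_depth=0)
level1 = collect_emails(pages, ["item001.htm"], max_depth=1)
print(level0)  # ['alice@example.com']
print(level1)  # ['alice@example.com', 'bob@example.com']
```

With level 0 only the listed pages themselves are searched; with level 1 the address on the page one click away is found as well.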

! If the option to delete duplicate addresses immediately is not selected, you can now remove duplicates using the delete button ("... Duplicate addresses") below the email list.
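
For readers who post-process an exported list themselves, duplicate removal can be sketched as follows. This is a minimal illustration, not the behavior of the delete button: the function name remove_duplicates is hypothetical, and comparing addresses case-insensitively is an assumption made here, not something the program is documented to do.

```python
def remove_duplicates(addresses):
    """Drop repeated addresses, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for address in addresses:
        cleaned = address.strip()
        key = cleaned.lower()  # assumption: compare case-insensitively
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


collected = ["a@example.com", "A@Example.com ", "b@example.com", "a@example.com"]
print(remove_duplicates(collected))  # ['a@example.com', 'b@example.com']
```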



Copyright © 2022 Sven Bader