Scrape Wayback for AU domain name and first archived date

Cancelled Posted 6 years ago Paid on delivery

Unless someone in the Freelancer community knows otherwise, there doesn't seem to be a way of querying the AU TLD whois records for the creation date of Australian domain names, even with dig or any other command-line tools. My imperfect solution is to harvest the earliest Wayback archive entry.
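
As a rough sketch of the "earliest Wayback entry" idea: the CDX server README listed under the resources documents `url`, `limit` and `fl` query parameters, and for a single URL the captures come back oldest first, so `limit=1` should return only the first archived capture. The function names below are hypothetical, the HTTPS endpoint is an assumption based on the README's basic usage examples, and treating an empty response as "no capture" is also an assumption. This is not the attached batch script.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint, based on the CDX server README's basic usage section.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def earliest_capture_url(domain: str) -> str:
    """Build a CDX query asking for only the first (oldest) capture of a domain."""
    params = {
        "url": domain,               # domain to look up, e.g. example.com.au
        "limit": 1,                  # first result only
        "fl": "timestamp,original",  # return just these two fields
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def fetch_earliest_capture(domain: str, timeout: float = 30.0) -> str:
    """Return the raw response text; an empty string is assumed to mean no capture."""
    with urlopen(earliest_capture_url(domain), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace").strip()
```

With 2.1 million domains to query, a real run would presumably also need pauses between requests and retries, which this sketch leaves out.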

Image 1 shows the two data elements I need to extract from the zonefile CSV I have: (1) the domain name and (2) the date the domain name was first archived in Wayback.

Using a script I found in the Wayback APIs, I have built a [clunky] batch script [[login to view URL], attached] that captures data into a series of files, which I batch-rename to a CSV [[login to view URL], example attached].

Each capture from [login to view URL] produces a file with the oldest archive date on line 1. The domain name is obvious from each line; the date is a 14-digit timestamp in YYYYMMDDhhmmss format.
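
A minimal sketch of pulling the oldest capture off line 1, assuming each capture file holds default CDX output (space-separated fields in the order urlkey, timestamp, original URL, and so on). The attached capture files may be laid out differently, in which case the pattern would need adjusting. `FIRST_LINE` and `oldest_capture` are hypothetical names.

```python
import re

# Assumed default CDX field order: urlkey timestamp original mimetype ...
FIRST_LINE = re.compile(r"^\S+\s+(\d{14})\s+(\S+)")


def oldest_capture(capture_text: str):
    """Return (original_url, timestamp) from line 1, or None if nothing matches."""
    lines = capture_text.splitlines()
    if not lines:
        return None  # empty capture: no Wayback record
    match = FIRST_LINE.match(lines[0])
    if match is None:
        return None  # line 1 is not in the assumed CDX layout
    timestamp, original = match.groups()
    return original, timestamp
```

Only line 1 is inspected, since the oldest archive date is expected there; later lines are ignored.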

The script will need a chunk of regex to pick off the first line of each successful data capture, convert the timestamp to a readable format (e.g. 19981202014938 becomes 02/12/1998), and append the domain name and date to an external CSV file.
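
One possible way to do the date clean-up and the CSV append, sketched with Python's `datetime.strptime` instead of a regex (a swapped-in technique, not what the brief asks for literally). The function names and the two-column row layout are assumptions.

```python
import csv
from datetime import datetime


def readable_date(timestamp: str) -> str:
    """Convert a 14-digit Wayback timestamp to DD/MM/YYYY."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S").strftime("%d/%m/%Y")


def format_row(domain: str, timestamp: str) -> list:
    """Build the CSV row for one domain: [domain, readable date]."""
    return [domain, readable_date(timestamp)]


def append_result(csv_path: str, domain: str, timestamp: str) -> None:
    """Append one row to the output CSV, creating the file if it does not exist."""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow(format_row(domain, timestamp))
```

For the example in the brief, `readable_date("19981202014938")` gives `02/12/1998`.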

Some of the domains have no entry in Wayback, but the script will still need to write the URL to the CSV with a 'nul' value as the date element, so I can see which have no Wayback records.
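
The 'nul' fallback could look something like the following sketch, which uses a regex on the 14-digit timestamp and treats a missing or malformed timestamp as "no Wayback record". The names are hypothetical, and mapping malformed values to 'nul' is an assumption rather than something the brief specifies.

```python
import re

NO_RECORD = "nul"  # marker requested for domains with no Wayback entry
TIMESTAMP = re.compile(r"^(\d{4})(\d{2})(\d{2})\d{6}$")  # YYYYMMDDhhmmss


def date_or_nul(timestamp) -> str:
    """Return DD/MM/YYYY for a valid timestamp, otherwise the 'nul' marker."""
    if not timestamp:
        return NO_RECORD
    match = TIMESTAMP.match(timestamp.strip())
    if match is None:
        return NO_RECORD  # assumption: malformed values count as no record
    year, month, day = match.groups()
    return f"{day}/{month}/{year}"


def csv_line(domain: str, timestamp) -> str:
    """Build one output line; every domain gets a line, with or without a date."""
    return f"{domain},{date_or_nul(timestamp)}"
```

Because every domain produces a line, the rows with 'nul' can be filtered out of the finished CSV afterwards.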

The script should preferably read the URLs from an external text file.
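
Reading the list from an external file might be sketched as below, assuming one domain per line. The generator yields entries one at a time so a very long list never has to sit in memory at once. `iter_domains` is a hypothetical name, and lower-casing the entries is an assumption.

```python
from typing import Iterable, Iterator


def iter_domains(lines: Iterable[str]) -> Iterator[str]:
    """Yield cleaned domain names from an iterable of text lines, skipping blanks."""
    for line in lines:
        domain = line.strip().lower()  # assumption: domains are compared in lower case
        if domain:
            yield domain
```

An open file handle can be passed straight in, for example `with open("domains.txt") as handle:` followed by `for domain in iter_domains(handle):`, where `domains.txt` is a placeholder filename.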

The successful bidder can use the URLs listed in the attached '[login to view URL]' file as a test list for the script. I don't need the scrape, just the script (there are 2.1 million domain names to query).

Any other questions, DM me.

Some resources:
https://github.com/internetarchive/wayback/blob/master/wayback-cdx-server/README.md#basic-usage

https://blog.archive.org/developers/
https://blog.archive.org/2013/07/04/metadata-api/
https://archive.org/help/json.php

Autobids will be deleted if the proposal is not read and acknowledged within 12 hours.

Reissuing this project, as it should have been listed in AUD, not USD. Check for the new project just listed.

Data Entry Excel JavaScript JSON Shell Script

Project ID: #14356249

About the project

6 proposals Remote project Active 6 years ago

6 freelancers are bidding an average of $155 for this job

Venkat2011sri

Hi, I have been working as a freelancer for 12 years and have completed 1500 projects. I assure you 100% accuracy in the delivered work. I look forward to working with you. Relevant Skills and Experience Data Extraction Proposed

$166 USD in 3 days
(184 reviews)
6.8
ChinmoySarker

Hi, Being attracted with your declaration of the program, I feel tempted to have the chance to make your work complete carefully and sincerely. I would like at present to have your kind mind and as soon as possible.

$100 USD in 3 days
(28 reviews)
4.6
vietdevteam

I have read your project. I'm sure I can help you to do it. I have completed many projects similar to this project. Relevant Skills and Experience I am expert in web scraping. I have created many scraping tools. I h

$150 USD in 1 day
(8 reviews)
3.9
huongth

Hi. I am an expert in VBA, VBScript, Visual Basic, C#, F#, C, C++, ASM, Delphi, Java, iMacros, Flash, ASP, ASP.NET, Access, MySQL, MSSQL, QuickBooks, Oracle. I can create auto scripts to scrape websites, auto click, fo

$150 USD in 3 days
(16 reviews)
3.7