
Extract genealogy data from webpages using XSLT2.0

$500-750 USD

Completed
Posted over 17 years ago


Paid on delivery
**Intro:** A lot of people post family histories on the Web. Although there are millions of such webpages, most are generated by one of about 50 different "genealogy->HTML" generators. Each generator uses a consistent HTML tag layout, so an XSLT script written for a specific generator can extract the original genealogy data (names, dates, and places) from the webpages that generator produces.

**Deliverables:** This project is to write XSLT scripts for 10 of the most common generators. If you do a good job on these first 10 and get them done quickly, there will be follow-on projects for additional scripts. (I need to have all 50 scripts completed by December 15.) You are given 2-4 example URLs for each generator. For each generator you need to download the example webpages, run them through TagSoup, and write an XSLT2.0 script that extracts the genealogical information (names, dates, and places for the individual and his/her parents, spouse, and children) for each individual listed on the webpage and outputs it into an XML file. The required format of the output XML file is given as a Relax-NG (compact form) schema. The XSLT2.0 script must be executable by Saxon 8.0.

**Additional Notes:** You do not need to follow any links; just extract the required data from the example URLs. Your script should generalize well to unseen pages from the same generator. That is, it should handle a varying number of children, degrade gracefully when dates or places are not given, and so on. Some of the example XSLT scripts (see below) handle webpages in multiple languages. Your scripts need to handle only webpages in English.

**Example scripts:** To help you get a better idea of what is needed, you are given five example XSLT scripts. You can use these five examples to see what type of information must be output for each page, and as source code that you can modify for your own XSLT scripts.
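As a rough illustration of the kind of script described above, here is a minimal XSLT 2.0 sketch. It is not one of the five example scripts, and everything specific in it is an assumption: the input layout (`div class="person"`, `h2`, `span class="birth"`, `ul class="children"`) stands in for whatever tags a real generator emits, and the output element names (`persons`, `person`, `name`, `date`, `place`, `child`) are placeholders, since the Relax-NG schema is not reproduced here. It does rely on TagSoup placing its output in the XHTML namespace, which is why `xpath-default-namespace` is set.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. The matched HTML layout and the output element names are
     hypothetical; a real script must follow the generator's actual tags
     and the project's Relax-NG schema. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes"/>

  <!-- One output record per individual found on the page -->
  <xsl:template match="/">
    <persons>
      <xsl:apply-templates select="//div[@class = 'person']"/>
    </persons>
  </xsl:template>

  <xsl:template match="div[@class = 'person']">
    <person>
      <name><xsl:value-of select="normalize-space(h2[1])"/></name>
      <!-- Events are optional: nothing is emitted when the page omits them -->
      <xsl:apply-templates select="span[@class = ('birth', 'death')]"/>
      <!-- Works for zero, one, or many children -->
      <xsl:for-each select="ul[@class = 'children']/li">
        <child><xsl:value-of select="normalize-space(.)"/></child>
      </xsl:for-each>
    </person>
  </xsl:template>

  <!-- Degrade gracefully: write only the date/place parts that are present -->
  <xsl:template match="span[@class = ('birth', 'death')]">
    <xsl:variable name="date" select="normalize-space(span[@class = 'date'][1])"/>
    <xsl:variable name="place" select="normalize-space(span[@class = 'place'][1])"/>
    <xsl:if test="$date or $place">
      <xsl:element name="{@class}">
        <xsl:if test="$date"><date><xsl:value-of select="$date"/></date></xsl:if>
        <xsl:if test="$place"><place><xsl:value-of select="$place"/></place></xsl:if>
      </xsl:element>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The stylesheet would be applied to the TagSoup output for a page rather than to the raw HTML, so that the processor receives well-formed XML.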
## Deliverables

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform

You can run the scripts on any platform that Saxon 8 and TagSoup run on. (Both run on Java.)

You can download Saxon-B 8.8 from [login to view URL]

You can download the TagSoup jar from [login to view URL]~cowan/XML/tagsoup/.
Project ID: 3843042

About the project

6 proposals
Remote project
Active 18 years ago

Awarded to:
See private message.
$425 USD in 17 days
4.9 (26 reviews)
6.1
6 freelancers are bidding an average of $494 USD for this job
See private message.
$425 USD in 17 days
5.0 (16 reviews)
6.1

See private message.
$425 USD in 17 days
5.0 (33 reviews)
5.1

See private message.
$637.50 USD in 17 days
5.0 (39 reviews)
5.2

See private message.
$501.50 USD in 17 days
4.8 (26 reviews)
5.0

See private message.
$552.50 USD in 17 days
4.3 (8 reviews)
3.0

About this client

Saint Paul, United States
5.0
39
Member since May 20, 2005
