Automatic imports from Wikimedia

$750-1500 USD

In progress
Posted more than 8 years ago

Paid on delivery
I'm working on a project that creates fingerprints for images and videos. The code for this is available as [login to view URL] and is undergoing constant development. What I'm looking for now is to use this blockhash algorithm to create a database of hashes for images and videos available from Wikipedia (especially Wikimedia Commons), and to keep this database updated as works are added, removed, or updated on Wikipedia. This project is to create the Python scripts that run in the background and keep the local database in sync with Wikipedia.

We have written a similar program, available here: [login to view URL], but it has not been updated and has some bugs. It also had no real support for removing or updating works as they changed on Wikipedia. The contractor who continues this work may choose to work with this existing code base or start anew (starting from scratch might be preferable).

The program should consist of two parts, a server and a client, which interact with each other in a way that the contractor can define. Previously we've used RabbitMQ or interactions through the PostgreSQL database, with almost equal success.

The server is responsible for:
- Interacting with the Wikimedia API to find works that have been newly added, removed, or updated
- Adding any such works to a "queue" for further processing
- Monitoring the clients' work (for instance, if a work cannot be processed and clients have tried it 3-4 times at different intervals, marking the work as "error" so that it is not processed again)

The client is responsible for:
- Retrieving works from the queue to process
- Getting information from the Wikimedia API about:
-- The title
-- The copyright statement (Creative Commons or similar)
-- The author (name)
-- The available media files (image or video files)
-- For each media file:
--- The URL of the media file
--- The blockhash of the media file (calculated by the blockhash command mentioned above)

The information retrieved should be limited to the basic information about a work. See for instance [login to view URL] for examples from a previous project of what information we stored about each work.

The server and client should both be constructed so that other sources of information, such as Flickr, can easily be added later (the logic would be the same, but the exact API calls etc. would change).

The retrieved information should be stored in a PostgreSQL database.
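
To make the server/client split above more concrete, here is a rough sketch of how the queue and retry rule might fit together. It is only an in-memory illustration under stated assumptions, not a required design: the names Source, WorkItem, WorkQueue, poll_changes, claim and report are all hypothetical, the real queue would live in RabbitMQ or PostgreSQL, and the waiting time between retries is left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Optional, Protocol, Tuple


class Status(Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    DONE = "done"
    ERROR = "error"


@dataclass
class WorkItem:
    source: str      # e.g. "wikimedia"; "flickr" could be added later
    work_id: str     # source-specific identifier (hypothetical format)
    change: str      # "added", "updated" or "removed"
    attempts: int = 0
    status: Status = Status.QUEUED


class Source(Protocol):
    """Hypothetical plug-in interface: one implementation per site."""

    name: str

    def poll_changes(self) -> Iterable[Tuple[str, str]]:
        """Yield (work_id, change) pairs for works that changed."""
        ...


class WorkQueue:
    """In-memory stand-in for the RabbitMQ or PostgreSQL queue."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.max_attempts = max_attempts
        self._items: Dict[Tuple[str, str], WorkItem] = {}

    def enqueue_from(self, source: Source) -> int:
        """Server side: queue every change a source reports; return the count."""
        count = 0
        for work_id, change in source.poll_changes():
            self._items[(source.name, work_id)] = WorkItem(source.name, work_id, change)
            count += 1
        return count

    def claim(self) -> Optional[WorkItem]:
        """Client side: take the next queued item, or None if nothing is queued."""
        for item in self._items.values():
            if item.status is Status.QUEUED:
                item.status = Status.PROCESSING
                item.attempts += 1
                return item
        return None

    def report(self, item: WorkItem, ok: bool) -> None:
        """Record the outcome; after max_attempts failures mark the item as error."""
        if ok:
            item.status = Status.DONE
        elif item.attempts >= self.max_attempts:
            item.status = Status.ERROR
        else:
            item.status = Status.QUEUED  # eligible for another attempt
```

With this shape, supporting another site such as Flickr would mean writing one more class with a name attribute and a poll_changes method, while the queue and retry logic stay unchanged.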
Project ID: 8881685

About the project

1 proposal
Remote project
Active 8 years ago

About this client

Gnesta, Sweden
5.0
9
Payment method verified
Member since May 10, 2015
