Spark/Scala NLP specialist needed to optimize my code
Closed
Posted
5 years ago
Paid on delivery
$2-8 USD / hour
Hi,
I wrote an Apache Spark program in Scala to compute TF-IDF over a corpus. It hangs at a point near a group-by statement, and I need someone to fix that issue.
I have a list of articles stored in S3 as Parquet. First I read them into a DataFrame and generate n-grams from them; that is one side of the job.
On the other side, S3 holds 10k posts (the corpus) as Parquet, which I also read into a DataFrame.
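For illustration, the n-gram step described above could look roughly like the sketch below. It is not the poster's code. It assumes a `text` column and uses Spark ML's `Tokenizer` and `NGram` transformers. The object name `NGramSketch`, the helper `withNGrams`, the column names, and the sample rows are all hypothetical, and the in-memory rows stand in for the real `spark.read.parquet(...)` call against S3.

```scala
import org.apache.spark.ml.feature.{NGram, Tokenizer}
import org.apache.spark.sql.{DataFrame, SparkSession}

object NGramSketch {
  // Adds an "ngrams" column (array<string>) built from the "text" column.
  // Tokenizer lowercases and splits on whitespace; NGram joins n adjacent
  // tokens with a single space.
  def withNGrams(df: DataFrame, n: Int): DataFrame = {
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("tokens")
    val ngram = new NGram().setN(n).setInputCol("tokens").setOutputCol("ngrams")
    ngram.transform(tokenizer.transform(df))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ngram-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Assumption: in the real job this DataFrame would come from
    // spark.read.parquet(<S3 path of the articles>) instead of sample rows.
    val articles = Seq(
      (1L, "spark makes big data simple"),
      (2L, "big data needs spark")
    ).toDF("id", "text")

    withNGrams(articles, 2).select("id", "ngrams").show(truncate = false)
    spark.stop()
  }
}
```

The same helper could be applied to the corpus DataFrame so that both sides share one tokenization scheme.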
Now I want to find the document frequency of each term (n-gram) against the corpus.
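One common way to compute document frequency without comparing every term against every post is to explode the corpus n-grams, de-duplicate per document, count per term, and then join the counts onto the article terms. The sketch below shows that approach under stated assumptions. It is not the poster's code and makes no claim about why the original job hangs. The object name `DocFreqSketch`, the helper `documentFrequency`, and the column names `doc_id`, `ngrams`, `term`, and `doc_freq` are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

object DocFreqSketch {
  // terms:  one row per candidate term, column "term"
  // corpus: one row per post, columns "doc_id" and "ngrams" (array<string>)
  // Returns one row per distinct term with "doc_freq", the number of corpus
  // posts that contain the term at least once (0 when it never occurs).
  def documentFrequency(terms: DataFrame, corpus: DataFrame): DataFrame = {
    val corpusTerms = corpus
      .select(col("doc_id"), explode(col("ngrams")).as("term"))
      .distinct() // count each (post, term) pair only once

    val docFreq = corpusTerms
      .groupBy("term")
      .count()
      .withColumnRenamed("count", "doc_freq")

    terms.distinct()
      .join(docFreq, Seq("term"), "left")
      .na.fill(0L, Seq("doc_freq"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("doc-freq-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample rows standing in for the Parquet data on S3.
    val terms = Seq("big data", "needs spark", "no match").toDF("term")
    val corpus = Seq(
      (1L, Seq("big data", "data needs", "big data")),
      (2L, Seq("big data", "needs spark"))
    ).toDF("doc_id", "ngrams")

    documentFrequency(terms, corpus).show(truncate = false)
    spark.stop()
  }
}
```

With N posts in the corpus, the resulting `doc_freq` can then be turned into a smoothed inverse document frequency such as log((N + 1) / (doc_freq + 1)), which is the form Spark ML's `IDF` estimator uses.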
[Removed by Freelancer.com Admin]
I have already written the code and am willing to share it with the right candidate.
Word2Vec knowledge is a plus.
Project ID: #18112853
About the project
2 proposals
Remote project
Active 5 years ago