Spark/Scala NLP specialist needed to optimize my code

Closed, posted 5 years ago, paid on delivery
Hi,

I wrote an Apache Spark program in Scala that computes TF-IDF over a corpus. It hangs near a group-by statement, and I need someone who can fix that issue.

I have a list of articles stored in S3 as Parquet. First I read it into a DataFrame and generate n-grams from it, which I keep on one side.
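For illustration, the n-gram step could look roughly like the sketch below. It is not the poster's code: the column names `id` and `text`, the object name `ArticleNGrams`, and the choice of bigrams are all assumptions, and the in-memory sample row stands in for the Parquet data that would normally come from `spark.read.parquet`.

```scala
import org.apache.spark.ml.feature.{NGram, Tokenizer}
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

object ArticleNGrams {
  // Assumes `articles` has an "id" column and a free-text "text" column
  // (hypothetical names). Returns one row per (id, term) pair.
  def ngrams(articles: DataFrame, n: Int): DataFrame = {
    // Tokenizer lowercases the text and splits it on whitespace.
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("tokens")
    // NGram joins each run of n consecutive tokens with a single space.
    val ngram = new NGram().setN(n).setInputCol("tokens").setOutputCol("ngrams")
    ngram
      .transform(tokenizer.transform(articles))
      .select(col("id"), explode(col("ngrams")).as("term"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ngrams-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the articles DataFrame read from Parquet.
    val articles = Seq((1L, "Spark makes big data simple")).toDF("id", "text")
    ngrams(articles, 2).show(truncate = false)

    spark.stop()
  }
}
```

Exploding the n-gram array into one row per term keeps the later aggregation a plain group-by on a string column.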

On the other side, S3 holds 10k posts (the corpus) as Parquet, which I also read into a DataFrame.

Now I want to find the document frequency of each term (n-gram) against the corpus.
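One way to phrase the document-frequency step is sketched below. It is an assumption about the shape of the data, not the removed code: `corpusTerms` is taken to hold one row per (`docId`, `term`) occurrence in the corpus, `articleTerms` one row per n-gram of interest, and the object and column names are hypothetical. The sketch also assumes the set of article terms is small enough to broadcast, so the corpus is filtered down before the shuffle that the group-by triggers.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{broadcast, countDistinct}

object DocumentFrequency {
  // Returns (term, df), where df is the number of distinct corpus documents
  // containing the term. Terms that never occur in the corpus are absent
  // from the result because the join is an inner join.
  def documentFrequency(articleTerms: DataFrame, corpusTerms: DataFrame): DataFrame = {
    val wanted = articleTerms.select("term").distinct()
    corpusTerms
      // Assumption: `wanted` fits in memory on each executor.
      .join(broadcast(wanted), Seq("term"))
      .groupBy("term")
      .agg(countDistinct("docId").as("df"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("df-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Tiny stand-ins for the article n-grams and the exploded corpus n-grams.
    val articleTerms = Seq("big data", "foo bar").toDF("term")
    val corpusTerms = Seq(
      (1L, "big data"), (1L, "big data"), (2L, "big data"), (2L, "other term")
    ).toDF("docId", "term")

    documentFrequency(articleTerms, corpusTerms).show(truncate = false)
    spark.stop()
  }
}
```

Whether this avoids the reported hang depends on the actual job. If a few terms dominate the corpus, the group-by can still be skewed, and the physical plan from `explain()` is the place to check.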

[Removed by Freelancer.com Admin]

I'm willing to share the code I have written with the right candidate.

Word2Vec knowledge is a plus.

Natural Language, Scala, Spark

Project ID: #18112853

About the project

2 proposals, remote project, active 5 years ago

2 freelancers are bidding an average of $5 / hour for this job

RigelData

Hi, hope you are doing great. Rigeldata Solutions offers custom application development and maintenance services using Open Source Software (OSS) solutions. We have a strong practice in Apache Spark, Apache NiFi, Kylo, ...

$5 USD / hour
(0 reviews)
0.0