*App Name: Akash Automatic Speech Recognition
*Description: A dedicated automatic speech recognition service to process any speech-to-text task. The idea is to provide a service similar to those of other cloud platforms.
A first version would be developed for English and run on CPU. Voice activity detection (VAD) would be integrated in order to automatically split the audio into separate tracks.
The main engine would use the Hugging Face library, and the network would probably be Wav2Vec2. A web interface would let users either upload files or record audio tracks. Later, an API could be added so the service can be used in other pipelines. Once GPU support is available, the development should take advantage of the additional computing power.
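As a rough illustration of the VAD-based splitting step, here is a minimal sketch using a naive frame-energy heuristic. This is an assumption for illustration only, not the planned implementation: a real version would use a proper VAD model, and each detected segment would then be passed to Wav2Vec2 through the Hugging Face library.

```python
import math

def energy_vad(samples, sample_rate, frame_ms=30, threshold=0.01):
    """Split a mono signal into speech segments with a naive
    frame-energy heuristic (illustrative only; a trained VAD
    model would replace this in practice).
    Returns a list of (start_sample, end_sample) tuples."""
    frame_len = int(sample_rate * frame_ms / 1000)
    segments, start = [], None
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        # Root-mean-square energy of the frame.
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms >= threshold and start is None:
            start = i                    # speech begins
        elif rms < threshold and start is not None:
            segments.append((start, i))  # speech ends
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments

# Synthetic example: 1 s of silence, 1 s of a 440 Hz tone, 1 s of silence.
sr = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * t / sr) for t in range(sr)]
audio = [0.0] * sr + tone + [0.0] * sr
print(energy_vad(audio, sr))  # one segment covering the middle second
```

Each `(start, end)` pair would become one audio track sent to the recognizer, so long recordings are transcribed chunk by chunk instead of in one pass.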
Who will use this?
Anyone interested in automatic speech recognition (interview transcription, movie subtitling, etc.). The main advantages would be keeping the service up to date with the state of the art and using it without having to send data to a third party.
How does it use Akash?
It uses Akash because automatic speech recognition can be very demanding in computing power, and it is often hard to dedicate a server full time to this one task. The ease of deployment in the Akash ecosystem means the service can be spun up on demand.
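To make the on-demand deployment idea concrete, here is a hypothetical Akash SDL sketch. The image name `lhemamou/akash-asr` and the resource and pricing figures are placeholders I am assuming for illustration, not an existing deployment:

```yaml
---
version: "2.0"

services:
  asr-web:
    image: lhemamou/akash-asr:latest   # placeholder image name
    expose:
      - port: 80
        as: 80
        to:
          - global: true

profiles:
  compute:
    asr-web:
      resources:
        cpu:
          units: 4          # CPU-only first version
        memory:
          size: 4Gi
        storage:
          size: 10Gi
  placement:
    akash:
      pricing:
        asr-web:
          denom: uakt
          amount: 1000

deployment:
  asr-web:
    akash:
      profile: asr-web
      count: 1
---
```

The deployment would be closed when no transcription work is pending, which is exactly the "pay only when needed" pattern described above.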
*Team: For the moment, only me.
*GitHub: Lhemamou (l31bn1tz)
Public Key: (optional)
About me: I have a PhD in automatic speech processing. I can easily handle the machine learning part, while struggling a bit with the front-end part. Anyone who wants to join the project is welcome.
An estimate for the project budget is 100~500 AKT.