After a working model of my project was ready, it was time to test it for production, and the best way to do that before deploying a web application is to run it locally in Docker.
Before setting up Docker, I looked for ways to reduce the size of my application, since I was running an open-source embedding model (BAAI/bge-m3). After a little research, I learned that I could call this model through the Hugging Face Inference API instead of bundling it locally. All that is needed is a Hugging Face API token.
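Before the retry logic shown later, the request needs an auth header built from that token. A minimal sketch of how that might look, assuming the token lives in an environment variable (the `HF_TOKEN` name and the helper itself are my placeholders, not the project's actual code):

```python
import os

def hf_headers() -> dict:
    # Read the Hugging Face API token from the environment so it never
    # gets baked into the Docker image or committed to the repo.
    token = os.environ["HF_TOKEN"]
    return {"Authorization": f"Bearer {token}"}
```

The same header dict can then be passed to every call against the model's Inference API endpoint.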
After testing this out, I installed Docker. Docker uses Linux under the hood: it containerizes your whole application (think of packaging it up and shipping it to an alien world) and runs it in an isolated container. If your application works error-free here, it will most probably run error-free in the cloud. Since we needed two containers (one for the frontend and one for the backend), we needed to define two Dockerfiles. This is where Docker Compose comes in: a single file for defining, initializing, and running your containers in one go. To build both of my containers, all I needed to do was run this command: docker compose build, and done.
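A compose file for a two-container setup like this might look roughly as follows. The service names, ports, and directory layout are placeholders for illustration, not the project's actual structure:

```yaml
# docker-compose.yml — hypothetical frontend + backend pair
services:
  backend:
    build: ./backend        # expects backend/Dockerfile
    ports:
      - "8000:8000"
    env_file:
      - .env                # keeps the Hugging Face token out of the image
  frontend:
    build: ./frontend       # expects frontend/Dockerfile
    ports:
      - "3000:3000"
    depends_on:
      - backend
```

With this in place, `docker compose build` builds both images and `docker compose up` starts both containers together.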
Once this was done, I deployed these two Docker containers to Google Cloud, which gives you $300 in free credits. I used gcloud via the CLI (Command Line Interface), which made it easy to change my application and re-deploy using the same commands. You have to be cautious about your secrets, region, and other sensitive parameters that go in.
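The post doesn't show the exact deployment commands, but a typical gcloud flow for a container looks roughly like this (PROJECT_ID, service name, and region are placeholders; this sketch assumes Cloud Run as the target):

```shell
# Build the backend image with Cloud Build and push it to the registry
gcloud builds submit ./backend --tag gcr.io/PROJECT_ID/backend

# Deploy the image to Cloud Run; --region pins where the service runs
gcloud run deploy backend \
  --image gcr.io/PROJECT_ID/backend \
  --region us-central1
```

Secrets such as the Hugging Face token are better injected via Secret Manager or environment variables at deploy time than baked into the image.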
Using BAAI/bge-m3 via Hugging Face:
# Inside an async method — max_retries, api_url, self.hf_headers,
# text, and logger are defined elsewhere in the class.
async with httpx.AsyncClient() as client:
    for attempt in range(max_retries):
        try:
            response = await client.post(
                api_url,
                headers=self.hf_headers,
                json={"inputs": text},
                timeout=10.0,
            )
            if response.status_code == 200:
                embedding = response.json()
                logger.info(f"✅ Embedding generated: {len(embedding)} dimensions")
                return embedding
        except httpx.HTTPError:
            # Transient network error or timeout — back off briefly and retry
            await asyncio.sleep(2 ** attempt)