llm-tool
LLMTool eliminates the guesswork and frustration of deploying LLM inference by providing production-ready vLLM configurations and GPU sizing recommendations in a single API call. Instead of spending hours researching GPU requirements, reading conflicting forum posts, and dealing with OutOfMemory crashes, developers simply input any model name from major open-source families and receive complete…
llm-tool endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/service-info | Get comprehensive information about service capabilities |
| GET | /health/ready | Readiness check for deployment orchestration |
| GET | /v1/supported-models | Get a list of all supported models across all families |
| POST | /v1/command | **This endpoint provides**: - Complete vLLM command string ready for execution - Same parameter calculation logic as /v1/calculate endpoint - Exact format matching deployment… |
| POST | /v1/calculate | **This endpoint provides**: - GPU resource recommendations (type, count, memory) - Complete vLLM parameter configuration optimized for the model - Deployment complexity… |
| GET | /health/status | Lightweight status endpoint optimized |
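
As a rough illustration of how the endpoints above might be called, here is a minimal Python sketch using only the standard library. The listing does not give the host name or the request schema, so `BASE_URL`, the `model` field name, and the example model identifier are all assumptions made for illustration; check the service's own documentation for the real values. The helper only builds requests, and `send` is a separate step, so nothing touches the network until you call it.

```python
import json
import urllib.request

# Hypothetical base URL: the listing does not state the real host.
BASE_URL = "https://llm-tool.example.com"


def build_request(endpoint, payload=None):
    """Build (but do not send) a request for one of the listed endpoints.

    A request without a payload is a GET; a request with a payload is a
    POST whose body is the payload encoded as JSON.
    """
    url = BASE_URL + endpoint
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a prepared request and decode the response body as JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Readiness probe, as an orchestrator might issue it.
ready = build_request("/health/ready")

# Sizing request. The "model" field name and its value are assumptions;
# the listing only says that a model name is the input.
calc = build_request("/v1/calculate", {"model": "example-org/example-model"})
```

With a real host in `BASE_URL`, `send(calc)` would return the decoded JSON response; the same pattern would apply to /v1/command, assuming it accepts the same body.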
llm-tool pricing
| Plan | Price | Rate limit | Quotas |
|---|---|---|---|
| BASIC (Recommended) | Free | — | |