Artificial intelligence
for Brazil

Performance on Brazilian benchmarks

Our models

Sabiazinho - 4

Lightweight, fast and economical: ideal for simple, real-time tasks.

Sabiá - 4

More intelligent and precise for demanding applications.

Long document analysis

Sabiá reads and interprets PDFs and long texts, summarizing the most relevant information with speed and precision.

Understanding complex topics

Turns difficult subjects into clear, accessible explanations adapted to your level of knowledge.

Automated data analysis

Connects and interprets data from multiple sources to generate fast, organized and actionable insights.

Frequently asked questions

How do I use the Sabiá models? +
Go to chat.maritaca.ai or download the Android/iOS app. For API usage, see the documentation at docs.maritaca.ai.
What can the models do? +
Sabiá models are specialized in Portuguese and Brazilian contexts.

Main capabilities include: analyzing and summarizing long documents (PDFs, contracts, legislation), deep understanding of Brazilian legal, educational and institutional topics, function calling for system integration, web search, and Portuguese text generation with high fluency.

For benchmark performance details, see the models page in the documentation.
Can I fine-tune the models on my own data? +
We don't currently offer fine-tuning. For use cases requiring customization, we recommend using RAG (Retrieval-Augmented Generation) with our models, which usually covers most scenarios. See our RAG guide in the documentation.
How are the models trained, and what is the architecture? +
The models are based on the Transformer architecture. The exact parameter count is not publicly disclosed. The Sabiá 4 family was trained from a generalist base model, with continual learning in four stages: (1) continued pre-training on a large Portuguese corpus and a Brazilian legal corpus, (2) long-context extension to 128k tokens, (3) supervised fine-tuning on chat, code, legal tasks, instruction-following and function-calling data, and (4) preference alignment. Pre-training data goes through a quality-filtering, relevance-scoring and document-rewriting pipeline. In addition to public data, the models were also trained on proprietary commercial data.
Is my data used for training? What is the data retention policy? +
API: all data sent to our servers is immediately discarded after the response is generated. We only store the token count for billing purposes.

Web Chatbot and Mobile Apps: conversation data is stored so users can access them later. However, this data is not used for training, regardless of whether the user is on a paid or free plan. User privacy and security are maintained at all times.
Are the models available for local (on-premise) use? +
Currently, Sabiá 4 models are only available via API. We don't offer weights for download or licensing for on-premise use. All data sent to the API is discarded immediately after the response is generated and is not used for training.
Do you have embedding models for Retrieval-Augmented Generation (RAG)? +
We don't currently offer our own embedding model. We recommend using multilingual models such as Multilingual-E5-large. See our RAG guide in the documentation for a complete integration example.
What cloud infrastructure does Maritaca use? +
Our models run on GPUs at Oracle Cloud, Amazon AWS, Google Cloud, and Verda. Training is mostly performed on TPUs at Google Cloud.
Do you offer plans for companies? +
Yes. For companies that need a contract, dedicated SLA, custom billing or priority support, please fill out the form at maritaca.ai/en/api.

For direct API usage, just create an account at plataforma.maritaca.ai — no need to talk to our team.
What is the SLA and uptime guarantee for the API? +
API availability can be tracked in real time at status.maritaca.ai. For specific SLA needs or enterprise contracts, contact info@maritaca.ai.
How do I integrate Maritaca into my system? +
The Maritaca API is compatible with the OpenAI library. If you already use OpenAI, just change the base_url to https://chat.maritaca.ai/api and use your Maritaca key. For more details, see docs.maritaca.ai.
What are the API rate limits? +
Request limits vary by plan. For up-to-date details, see the documentation at docs.maritaca.ai/pt/rate-limits or the platform at plataforma.maritaca.ai/rate-limits.
Do the models support function calling? +
Yes. Sabiá 4 family models support function calling, letting the model invoke functions you define to integrate with external systems, APIs and databases. Full documentation with examples is at docs.maritaca.ai/pt/chamada-funcao.
Can I use the API to process documents (PDFs, images)? +
Yes. The API accepts file attachments such as PDFs and images alongside the message. Text content is extracted automatically and passed to the model. This enables analysis, summarization and Q&A over documents directly via the API. See docs.maritaca.ai/pt/files.