Directory Structure
The Arkalos folder structure keeps your project well organized by separating code to run from code to reuse. It helps you grow your project, align teams, and manage modules, configuration, notebooks, scripts, data, and documentation.
You can use this structure for any project, from basic customer research, an academic project, a terminal chatbot, or a personal assistant to IoT, robotics, and autonomous vehicles (AVs).
app/
    _private/ # Git-ignored folder for personal or work-in-progress code
    ai/ # AI and ML code
        agents/ # AI agents (e.g., chatbots, assistants, robots, AVs)
        environments/ # Environments where agents operate
        evals/ # AI model evaluation tools
        tasks/ # Tasks that agents can perform
        trainers/ # Model training and fine-tuning modules
    algorithms/ # Custom algorithms and computational logic
    cli/ # Command Line Interface tools
    core/ # Core app logic and Arkalos extensions
    data/ # Data extraction, transformation, analysis, warehousing
        analyzers/ # Data analysis modules (e.g., classification, clustering)
        extractors/ # Data source connectors and extraction tools
        transformers/ # Data cleaning, normalization, and transformation tools
        types/ # Custom data types (data contracts)
        visualizers/ # Data visualization, charts, graphs and plots
        warehouse/ # Data warehouse setup and loaders
    http/ # HTTP servers, APIs, dashboards, and microservices
    jobs/ # Background tasks, queues, and cron jobs
    sensors/ # Interfaces for hardware data collection (e.g., vision)
    utils/ # Utility functions and helpers
    workflows/ # Multi-step workflows (e.g., data pipelines, automation)
        ai/ # AI-specific workflows (config, training, evaluation)
        etl/ # Data extraction, transformation, and loading workflows
        experiments/ # Scientific experiments and hypothesis testing
        gen/ # Data generation workflows for testing
        processes/ # Business processes and other automation workflows
config/ # Configuration files with Python logic, works with .env
data/ # Git-ignored data folder, used manually and by Arkalos
    drive/ # Main storage for raw data, PDFs, CSVs, images, etc.
    dwh/ # Data warehouse data. Auto-generated schema and cache
    gen/ # Automatically generated data for testing
    keys/ # Secret keys for API authentication and services
    logs/ # Logs generated by Arkalos (`arkalos-<year>-<month>.log`)
    models/ # Trained AI models saved here
docs/ # Project documentation (usage, modules, contributions)
notebooks/ # Jupyter Notebooks for exploration and prototyping
    _private/ # Git-ignored folder for personal notebook drafts
scripts/ # Stand-alone executable scripts
    _private/ # Git-ignored folder for personal scripts
    ai/ # Scripts to train models or run AI agents
    cli/ # Command Line Interface scripts
    etl/ # Data workflows: extraction, transformation, loading
    experiments/ # Prototyping, exploration, and scientific experiments
    gen/ # Scripts to generate test data
    http/ # Serve APIs, microservices, dashboards, or web apps
    jobs/ # Scheduled tasks, background jobs, and cron jobs
    processes/ # Business processes and automation scripts
tests/ # Unit and other tests for your code
Private code inside _private folders
In an Arkalos project, the folders named _private/ are git-ignored.
This means that when you share your code with the team and push it to a Git repository (e.g., GitHub), files inside these _private folders won't be committed or shared.
If you're working alone, you can ignore these folders.
In a team setting, use them for preliminary versions of your code or work-in-progress files without worrying about accidentally committing them.
Code to Run: notebooks/ and scripts/
All runnable code should be placed in either the notebooks/ or scripts/ folder.
Typically, you start by exploring data, experimenting, or prototyping in a Jupyter Notebook.
Once ready, convert your work into stand-alone scripts that you and your team can run from the terminal.
Code to Reuse and Consume: app/
Place any code you want to reuse across notebooks or scripts (such as functions, classes, modules, or packages) in subfolders inside the app/ directory.
Avoid having variables in the global scope, and ensure no code runs on import.
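One way to follow this advice is to wrap setup work in a function and cache its result, so nothing happens until a notebook or script asks for it. The sketch below is only an illustration: the module path and the names `load_settings` and `project_name` are hypothetical and are not part of Arkalos.

```python
# app/utils/settings.py (hypothetical module path and names)
from functools import lru_cache


@lru_cache(maxsize=1)
def load_settings() -> dict[str, str]:
    """Build the settings once, on first call, rather than at import time."""
    # Stand-in for real work such as reading files or opening connections.
    return {"project": "demo", "mode": "local"}


def project_name() -> str:
    """Read a value through the cached loader instead of a module-level global."""
    return load_settings()["project"]
```

Importing this module only defines two functions. The settings are built the first time `load_settings()` is called and reused afterwards.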
Code to Run vs. Code to Reuse:
Organize your code as follows:
notebooks/ – Jupyter Notebooks for exploration and prototyping
scripts/ – Stand-alone scripts to execute
app/ – Reusable code (functions, constants, modules, classes, packages)
Example of code that belongs in scripts/:
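A minimal sketch of such a script, assuming a hypothetical file scripts/experiments/summarize.py. The `summarize` helper and the sample numbers are made up for illustration. In a real project the helper would normally live under app/ and be imported; it is inlined here only so the sketch runs on its own.

```python
# scripts/experiments/summarize.py (hypothetical path)


# In a real project this helper would live under app/ and be imported;
# it is inlined here only so the sketch runs on its own.
def summarize(values: list[float]) -> dict[str, float]:
    """Return the count, total, and mean of a list of numbers."""
    total = sum(values)
    mean = total / len(values) if values else 0.0
    return {"count": len(values), "total": total, "mean": mean}


def main() -> None:
    """Entry point: run the task and print the result to the terminal."""
    sample = [12.0, 7.5, 10.5]  # made-up sample data
    summary = summarize(sample)
    print(f"count={summary['count']} total={summary['total']} mean={summary['mean']}")


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard marks the file as something to execute, typically from the terminal with the Python interpreter.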
Example of reusable code that belongs in app/:
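A minimal sketch of a reusable module, assuming a hypothetical file app/utils/text.py. The function names are illustrative only. The module defines functions and does nothing else when imported.

```python
# app/utils/text.py (hypothetical module path and function names)


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return " ".join(text.split())


def word_count(text: str) -> int:
    """Count whitespace-separated words in a string."""
    return len(text.split())
```

A notebook or script could then import these functions and call them as needed, for example `normalize_whitespace("  Hello   world ")`, which returns `"Hello world"`.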
Other Primary Folders
In addition to notebooks/, scripts/, and app/, the root of your project includes:
config/ – Configuration files that work with the .env file. These are actual Python files and can include conditional logic, unlike simple text-based configs.
data/ – Upload raw data, analysis files, or secret keys here. Arkalos will also automatically store files such as data warehouse contents or trained models. This folder is git-ignored.
docs/ – Optional: Document your project, including module descriptions, usage instructions, and contribution guidelines.
tests/ – Write tests to ensure your code works as expected.
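To illustrate what conditional logic in a Python config file can look like, here is a hedged sketch. It does not show the actual Arkalos config format: the file name, the `build_config` function, and the `APP_ENV` variable are all assumptions. It also assumes that values from .env have already been loaded into the process environment.

```python
# config/example.py (hypothetical file; not the real Arkalos config format)
import os
from typing import Mapping, Optional


def build_config(env: Optional[Mapping[str, str]] = None) -> dict[str, object]:
    """Derive settings from environment variables with simple conditional logic."""
    source = os.environ if env is None else env
    app_env = source.get("APP_ENV", "local")  # APP_ENV is an assumed variable name
    is_production = app_env == "production"
    return {
        "env": app_env,
        "debug": not is_production,  # debug everywhere except production
        "log_level": "WARNING" if is_production else "DEBUG",
    }
```

Passing the environment in as a parameter keeps the logic easy to test. Calling `build_config()` with no argument reads from `os.environ`.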
Subfolders
Let's dive into the subfolders inside app/, data/, and scripts/.
app/
This folder contains reusable code organized into:
app/ai/ – AI and ML code, including agents, environments, tasks, model trainers, and evaluations.
app/algorithms/ – Custom computational logic when standard libraries aren't enough.
app/cli/ – Custom Command Line Interface tools and commands.
app/core/ – Core, initialization, or bootstrapping logic, and Arkalos extensions.
app/data/ – Data contracts (types), extraction, transformation, analysis, and warehousing.
app/http/ – Expose your project as an HTTP API, microservice, or full web UI/dashboard.
app/jobs/ – Background tasks, queues, and cron jobs.
app/sensors/ – Interface with hardware to collect data (e.g., vision sensors).
app/utils/ – Utility functions extending Python's standard capabilities.
app/workflows/ – Multi-step workflows like data pipelines, business processes, or automation.
app/ai/
Every AI agent will typically include an Agent class and a Task it can perform.
app/ai/agents/ – AI agents, from simple chatbots to complex robotics and autonomous vehicles.
app/ai/environments/ – Environments where agents operate and collect data, often through sensors.
app/ai/evals/ – Tools and modules for evaluating AI models.
app/ai/tasks/ – Tasks, tools, or skills that agents can perform.
app/ai/trainers/ – Modules for training and fine-tuning AI models.
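The pairing of an Agent class with a Task it can perform could be sketched as follows. The class shapes here are hypothetical and are not the Arkalos base classes. They only show how code in app/ai/agents/ and app/ai/tasks/ might relate.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    """A named skill: takes a message and returns a response (hypothetical shape)."""
    name: str
    run: Callable[[str], str]


@dataclass
class Agent:
    """An agent that holds tasks and dispatches messages to them (hypothetical shape)."""
    name: str
    tasks: dict[str, Task] = field(default_factory=dict)

    def add_task(self, task: Task) -> None:
        self.tasks[task.name] = task

    def perform(self, task_name: str, message: str) -> str:
        if task_name not in self.tasks:
            raise KeyError(f"{self.name} has no task named {task_name!r}")
        return self.tasks[task_name].run(message)


# Usage: register one task and ask the agent to perform it.
echo = Task(name="echo", run=lambda message: f"You said: {message}")
agent = Agent(name="helper")
agent.add_task(echo)
print(agent.perform("echo", "hi"))  # You said: hi
```

Keeping tasks separate from agents means the same task can be attached to more than one agent.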
app/data/
app/data/analyzers/ – Modules for exploring and analyzing data (e.g., classification, clustering).
app/data/extractors/ – Data sources, connectors, and extraction tools.
app/data/transformers/ – Tools for data cleaning, normalization, and advanced transformations.
app/data/types/ – Custom data types (data contracts).
app/data/visualizers/ – Data visualization, charts, graphs, plots.
app/data/warehouse/ – Your custom data warehouse setup and loaders.
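As an illustration of how a data contract and a transformer can work together, the sketch below defines a small typed record and a cleaning function that produces it. The `Customer` type, its fields, and `clean_customer` are invented for this example and do not come from Arkalos.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """A data contract: the shape and basic rules a customer record must satisfy."""
    id: int
    email: str

    def __post_init__(self) -> None:
        if self.id <= 0:
            raise ValueError("id must be a positive integer")
        if "@" not in self.email:
            raise ValueError(f"invalid email: {self.email!r}")


def clean_customer(raw: dict[str, str]) -> Customer:
    """A transformer: normalize a raw record and return a validated Customer."""
    return Customer(id=int(raw["id"]), email=raw["email"].strip().lower())
```

Under this split, a type like `Customer` would sit in app/data/types/ and a function like `clean_customer` in app/data/transformers/.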
app/workflows/
app/workflows/ai/ – AI workflows, including model configuration, training, and evaluation.
app/workflows/etl/ – Data workflows and pipelines, covering extraction, transformation, and loading.
app/workflows/experiments/ – Scientific experiments, hypothesis testing, and advanced workflows.
app/workflows/gen/ – Workflows to generate basic data, usually for testing.
app/workflows/processes/ – Business processes and other automation workflows.
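A multi-step workflow of the kind that might live in app/workflows/etl/ can be sketched as three small functions chained together. Everything here is an assumption made for illustration: the function names, the in-memory rows, and the list used as a stand-in for a warehouse.

```python
Row = dict[str, str]


def extract() -> list[Row]:
    """Step 1: fetch raw rows (made-up in-memory data instead of a real source)."""
    return [{"name": " ada lovelace "}, {"name": ""}, {"name": "alan turing"}]


def transform(rows: list[Row]) -> list[Row]:
    """Step 2: trim names, drop empty ones, and apply title case."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if name:
            cleaned.append({"name": name.title()})
    return cleaned


def load(rows: list[Row], target: list[Row]) -> int:
    """Step 3: append rows to the target (a list standing in for a warehouse)."""
    target.extend(rows)
    return len(rows)


def run_etl(target: list[Row]) -> int:
    """Run the three steps in order and return the number of rows loaded."""
    return load(transform(extract()), target)
```

A script in scripts/etl/ could then import `run_etl` and call it, which keeps the workflow reusable and the script short.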
data/
All files in data/ are git-ignored.
Most subfolders are used automatically by Arkalos, but you'll manually manage files in drive/ and keys/.
For sharing files with your team, use cloud storage (e.g., Google Drive) or a separate repository.
Subfolders:
data/drive/ – Your main data storage, similar to a personal drive or cloud storage. Store PDFs, CSVs, images, videos, raw data, or training datasets here.
data/dwh/ – The data warehouse. By default, SQLite is used, and its schema, cache, and data are auto-generated here.
data/gen/ – Automatically generated data, typically for testing purposes.
data/keys/ – Secret keys for services like Google Cloud API or enterprise servers.
data/logs/ – Logs generated by Arkalos. Default file format: `arkalos-<year>-<month>.log`.
data/models/ – Save trained AI model outputs here.
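When your own code needs a file from one of these subfolders, a small path helper can keep the layout in one place. This is a sketch using only `pathlib`. The `data_path` function is hypothetical, and Arkalos may provide its own helpers for this.

```python
from pathlib import Path

# Subfolder names taken from the list above.
DATA_SUBFOLDERS = ("drive", "dwh", "gen", "keys", "logs", "models")


def data_path(project_root: Path, subfolder: str, *parts: str) -> Path:
    """Build a path inside data/<subfolder>/, rejecting unknown subfolders."""
    if subfolder not in DATA_SUBFOLDERS:
        raise ValueError(f"unknown data subfolder: {subfolder!r}")
    return project_root.joinpath("data", subfolder, *parts)
```

For example, `data_path(Path("."), "drive", "raw", "sales.csv")` points at data/drive/raw/sales.csv relative to the current directory. The file name is made up.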
scripts/
Scripts often import and run workflows or services from the app/workflows/ directory. They are Code to Run, while app/ contains Code to Reuse.
Subfolders:
scripts/ai/ – AI-related scripts, like training models or running agents.
scripts/cli/ – Command Line Interface scripts for managing your workspace.
scripts/etl/ – Data extraction, transformation, and loading workflows.
scripts/experiments/ – Scripts for exploration, prototyping, and hypothesis testing.
scripts/gen/ – Scripts to generate basic or testing data.
scripts/http/ – Serve your app as an internal API, microservice, dashboard, or public web server.
scripts/jobs/ – Background tasks, queues, and cron jobs that run on a schedule.
scripts/processes/ – Business process automation and other workflows.
Configuring
Now that you know where to organize your files, the next step is understanding the configuration files included with Arkalos, such as config/app.py, .env.example, .env, and app/bootstrap.py.
Read next: Configuration & Env.