Directory Structure

The Arkalos folder structure keeps your project well-organized by separating code to run from code to reuse. It helps you grow your project, align teams, and easily manage modules, configuration, notebooks, scripts, data, and documentation.

You can use this structure for any project, from basic customer research, an academic project, a terminal chatbot, or a personal assistant to IoT, robotics, and autonomous vehicles (AVs).

app/
    _private/               # Git-ignored folder for personal or work-in-progress code
    ai/                     # AI and ML code
        agents/             # AI agents (e.g., chatbots, assistants, robots, AVs)
        environments/       # Environments where agents operate
        evals/              # AI model evaluation tools
        tasks/              # Tasks that agents can perform
        trainers/           # Model training and fine-tuning modules
    algorithms/             # Custom algorithms and computational logic
    cli/                    # Command Line Interface tools
    core/                   # Core app logic and Arkalos extensions
    data/                   # Data extraction, transformation, analysis, warehousing
        analyzers/          # Data analysis modules (e.g., classification, clustering)
        extractors/         # Data source connectors and extraction tools
        transformers/       # Data cleaning, normalization, and transformation tools
        types/              # Custom data types (data contracts)
        visualizers/        # Data visualization, charts, graphs and plots
        warehouse/          # Data warehouse setup and loaders
    http/                   # HTTP servers, APIs, dashboards, and microservices
    jobs/                   # Background tasks, queues, and cron jobs
    sensors/                # Interfaces for hardware data collection (e.g., vision)
    utils/                  # Utility functions and helpers
    workflows/              # Multi-step workflows (e.g., data pipelines, automation)
        ai/                 # AI-specific workflows (config, training, evaluation)
        etl/                # Data extraction, transformation, and loading workflows
        experiments/        # Scientific experiments and hypothesis testing
        gen/                # Data generation workflows for testing
        processes/          # Business processes and other automation workflows

config/                     # Configuration files with Python logic, works with .env

data/                       # Git-ignored data folder, used manually and by Arkalos
    drive/                  # Main storage for raw data, PDFs, CSVs, images, etc.
    dwh/                    # Data warehouse data. Auto-generated schema and cache
    gen/                    # Automatically generated data for testing
    keys/                   # Secret keys for API authentication and services
    logs/                   # Logs generated by Arkalos (`arkalos-<year>-<month>.log`)
    models/                 # Trained AI models saved here

docs/                       # Project documentation (usage, modules, contributions)

notebooks/                  # Jupyter Notebooks for exploration and prototyping
    _private/               # Git-ignored folder for personal notebook drafts

scripts/                    # Stand-alone executable scripts
    _private/               # Git-ignored folder for personal scripts
    ai/                     # Scripts to train models or run AI agents
    cli/                    # Command Line Interface scripts
    etl/                    # Data workflows: extraction, transformation, loading
    experiments/            # Prototyping, exploration, and scientific experiments
    gen/                    # Scripts to generate test data
    http/                   # Serve APIs, microservices, dashboards, or web apps
    jobs/                   # Scheduled tasks, background jobs, and cron jobs
    processes/              # Business processes and automation scripts

tests/                      # Unit and other tests for your code

Private code inside _private folders

In an Arkalos project, every folder named _private/ is git-ignored.

This means when you share your code with the team and push it to a Git repository (e.g., GitHub), any files inside these _private folders won’t be shared or committed.

If you're working alone, you can ignore such folders.

In a team setting, use them for preliminary versions of your code or work-in-progress files without worrying about accidentally committing them.

Code to Run: notebooks/ and scripts/

All runnable code should be placed in either the notebooks/ or scripts/ folders.

Typically, you start by exploring data, experimenting, or prototyping in a Jupyter Notebook.

Once ready, convert your work into stand-alone scripts that you and your team can run from the terminal.

Code to Reuse and Consume: app/

Place any code you want to reuse across notebooks or scripts — such as functions, classes, modules, or packages — in subfolders inside the app/ directory.

Avoid having variables in the global scope, and ensure no code runs on import.
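
As a minimal sketch of this guideline (the module path and function name are hypothetical, not part of Arkalos), a reusable module keeps its state inside functions and does no work at import time:

```python
# app/utils/pricing.py (hypothetical module, for illustration only)

# Avoid: module-level state and work that would run on every import, e.g.
#   rates = load_rates()      # executes at import time
#   print("pricing loaded")   # side effect at import time


def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by percent; all state stays local to the call."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)
```

Importing this module only defines apply_discount; nothing is computed or printed until a notebook or script calls it.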

Code to Run vs. Code to Reuse:

Organize your code as follows:

  • notebooks/ – Jupyter Notebooks for exploration and prototyping
  • scripts/ – Stand-alone scripts to execute
  • app/ – Reusable code (functions, constants, modules, classes, packages)

Example of code that belongs in scripts/:

scripts/cli/example.py
import app.utils.my_module as my_module

x = 5
print(my_module.my_func(x))

Example of reusable code that belongs in app/:

app/utils/my_module.py
def my_func(x):
    return x + 5

Other Primary Folders

In addition to notebooks/, scripts/, and app/, the root of your project includes:

  • config/ – Configuration files that work with the .env file. These are actual Python files and can include conditional logic, unlike simple text-based configs.
  • data/ – Upload raw data, analysis files, or secret keys here. Arkalos will also automatically store files such as data warehouse contents or trained models. This folder is git-ignored.
  • docs/ – Optional: Document your project, including module descriptions, usage instructions, and contribution guidelines.
  • tests/ – Write tests to ensure your code works as expected.
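
To illustrate what "conditional logic" in a Python config file can look like, here is a rough sketch. It does not show the actual Arkalos config format: the file name, the APP_ENV variable, and the setting keys are all assumptions, and it assumes the values from .env are already present in the process environment.

```python
# config/example.py (hypothetical file; the keys below are illustrative)
import os


def build_config(env: dict[str, str]) -> dict[str, object]:
    """Build settings from environment values, with conditional logic."""
    app_env = env.get("APP_ENV", "local")  # assumed variable name
    return {
        "env": app_env,
        # Conditional logic that a plain text config could not express:
        "debug": app_env != "production",
        "log_level": "DEBUG" if app_env == "local" else "INFO",
    }


# Assumes values from .env are already loaded into the process environment.
config = build_config(dict(os.environ))
```

Keeping the logic in a function that takes the environment as an argument makes the config easy to test with different values.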

Subfolders

Let’s dive into the subfolders inside app/, data/, and scripts/.

app/

This folder contains reusable code organized into:

  • app/ai/ – AI and ML code, including agents, environments, tasks, model trainers, and evaluations.
  • app/algorithms/ – Custom computational logic when standard libraries aren’t enough.
  • app/cli/ – Custom Command Line Interface tools and commands.
  • app/core/ – Core, initialization, or bootstrapping logic, and Arkalos extensions.
  • app/data/ – Data contracts (types), extraction, transformation, analysis, and warehousing.
  • app/http/ – Expose your project as an HTTP API, microservice, or full web UI/dashboard.
  • app/jobs/ – Background tasks, queues, and cron jobs.
  • app/sensors/ – Interface with hardware to collect data (e.g., vision sensors).
  • app/utils/ – Utility functions extending Python's standard capabilities.
  • app/workflows/ – Multi-step workflows like data pipelines, business processes, or automation.

app/ai/

Every AI agent will typically include an Agent class and a Task it can perform.

  • app/ai/agents/ – AI agents, from simple chatbots to complex robotics and autonomous vehicles.
  • app/ai/environments/ – Environments where agents operate and collect data, often through sensors.
  • app/ai/evals/ – Tools and modules for evaluating AI models.
  • app/ai/tasks/ – Tasks, tools, or skills that agents can perform.
  • app/ai/trainers/ – Modules for training and fine-tuning AI models.
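
The relationship between an agent and its tasks could be sketched as follows. This is plain Python written for illustration: the Task, EchoTask, and Agent names and their methods are hypothetical and are not the Arkalos base classes, which this page does not describe.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Task(Protocol):
    """Anything an agent can perform (hypothetical interface)."""

    name: str

    def run(self, message: str) -> str: ...


@dataclass
class EchoTask:
    """A trivial task that returns the message in upper case."""

    name: str = "echo"

    def run(self, message: str) -> str:
        return message.upper()


@dataclass
class Agent:
    """Holds a set of tasks and dispatches a message to one of them by name."""

    tasks: dict[str, Task] = field(default_factory=dict)

    def add_task(self, task: Task) -> None:
        self.tasks[task.name] = task

    def perform(self, task_name: str, message: str) -> str:
        if task_name not in self.tasks:
            raise KeyError(f"unknown task: {task_name}")
        return self.tasks[task_name].run(message)


agent = Agent()
agent.add_task(EchoTask())
print(agent.perform("echo", "hello"))
```

In a project laid out as above, the task class would live under app/ai/tasks/ and the agent class under app/ai/agents/, with a script in scripts/ai/ creating and running the agent.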

app/data/

  • app/data/analyzers/ – Modules for exploring and analyzing data (e.g., classification, clustering).
  • app/data/extractors/ – Data sources, connectors, and extraction tools.
  • app/data/transformers/ – Tools for data cleaning, normalization, and advanced transformations.
  • app/data/types/ – Custom data types (data contracts).
  • app/data/visualizers/ – Data visualization, charts, graphs, plots.
  • app/data/warehouse/ – Your custom data warehouse setup and loaders.
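
One possible shape for a custom data type in app/data/types/ is a small validated record. The sketch below uses only the standard library; the CustomerRecord name and its fields are hypothetical, and Arkalos may offer its own way to define data contracts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    """Hypothetical data contract for one row of customer data."""

    customer_id: int
    email: str
    country: str

    def __post_init__(self) -> None:
        # Validate at construction so malformed rows fail early.
        if self.customer_id <= 0:
            raise ValueError("customer_id must be positive")
        if "@" not in self.email:
            raise ValueError("email must contain '@'")


row = CustomerRecord(customer_id=1, email="ana@example.com", country="PT")
```

Extractors and transformers can then pass CustomerRecord instances around instead of loosely structured dictionaries, so every stage agrees on the fields it receives.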

app/workflows/

  • app/workflows/ai/ – AI workflows, including model configuration, training, and evaluation.
  • app/workflows/etl/ – Data workflows and pipelines, covering extraction, transformation, and loading.
  • app/workflows/experiments/ – Scientific experiments, hypothesis testing, and advanced workflows.
  • app/workflows/gen/ – Workflows to generate basic data, usually for testing.
  • app/workflows/processes/ – Business processes and other automation workflows.

data/

All files in data/ are git-ignored.

Most subfolders are used automatically by Arkalos, but you’ll manually manage files in drive/ and keys/.

For sharing files with your team, use cloud storage (e.g., Google Drive) or a separate repository.

Subfolders:

  • data/drive/ – Your main data storage, similar to a personal drive or cloud storage. Store PDFs, CSVs, images, videos, raw data, or training datasets here.
  • data/dwh/ – The data warehouse. By default, SQLite is used, and its schema, cache, and data are auto-generated here.
  • data/gen/ – Automatically generated data, typically for testing purposes.
  • data/keys/ – Secret keys for services like Google Cloud API or enterprise servers.
  • data/logs/ – Logs generated by Arkalos. Default file format: arkalos-<year>-<month>.log.
  • data/models/ – Save trained AI model outputs here.
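
When your own code reads or writes files under data/, it can help to build paths in one place. The helper below is a hypothetical sketch using pathlib, not an Arkalos API; Arkalos may provide its own path helpers.

```python
from pathlib import Path

# Subfolder names taken from the list above.
DATA_SUBFOLDERS = ("drive", "dwh", "gen", "keys", "logs", "models")


def data_path(project_root: Path, subfolder: str, filename: str) -> Path:
    """Return the path of a file inside one of the data/ subfolders."""
    if subfolder not in DATA_SUBFOLDERS:
        raise ValueError(f"unknown data subfolder: {subfolder}")
    return project_root / "data" / subfolder / filename


# Example: a raw CSV that you placed in data/drive/ manually.
csv_path = data_path(Path("."), "drive", "sales.csv")
```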

scripts/

Scripts often import and run workflows or services from the app/workflows/ directory. They are Code to Run, while app/ contains Code to Reuse.

Subfolders:

  • scripts/ai/ – AI-related scripts, like training models or running agents.
  • scripts/cli/ – Command Line Interface scripts for managing your workspace.
  • scripts/etl/ – Data extraction, transformation, and loading workflows.
  • scripts/experiments/ – Scripts for exploration, prototyping, and hypothesis testing.
  • scripts/gen/ – Scripts to generate basic or testing data.
  • scripts/http/ – Serve your app as an internal API, microservice, dashboard, or public web server.
  • scripts/jobs/ – Background tasks, queues, and cron jobs that run on a schedule.
  • scripts/processes/ – Business process automation and other workflows.
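
The split between a reusable workflow and the script that runs it might look like the sketch below. Both parts are shown in one block so the example is self-contained; the file paths, the CleanNamesWorkflow class, and its method names are hypothetical and do not represent an Arkalos workflow API.

```python
# --- app/workflows/etl/clean_names.py (hypothetical reusable workflow) ---
class CleanNamesWorkflow:
    """Multi-step workflow: extract, transform, load (all in memory here)."""

    def extract(self) -> list[str]:
        # Stand-in for reading from a real data source.
        return ["  alice ", "BOB", ""]

    def transform(self, names: list[str]) -> list[str]:
        # Trim whitespace, drop empty values, normalize capitalization.
        return [n.strip().title() for n in names if n.strip()]

    def load(self, names: list[str]) -> int:
        # A real workflow would write to a warehouse; this only counts rows.
        return len(names)

    def run(self) -> int:
        return self.load(self.transform(self.extract()))


# --- scripts/etl/clean_names.py (hypothetical script) ---
# In a real project this file would start with:
#   from app.workflows.etl.clean_names import CleanNamesWorkflow
if __name__ == "__main__":
    loaded = CleanNamesWorkflow().run()
    print(f"Loaded {loaded} rows")
```

The workflow class holds all the logic and can be imported from a notebook or a test, while the script stays a few lines long and only triggers the run.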

Configuring

Now that you know where to organize your files, the next step is understanding the configuration files included with Arkalos, such as config/app.py, .env.example, .env, and app/bootstrap.py.

Read next: Configuration & Env.