Configuration
=============

Directory layout
----------------

Elassandra packages Cassandra and the embedded OpenSearch runtime in one installation.
The top-level layout is:

* ``conf``: node configuration and logging
* ``bin``: startup scripts, administrative commands, and plugin tools
* ``lib``: packaged JVM dependencies
* ``modules``: packaged OpenSearch modules
* ``plugins``: installed OpenSearch plugins
* ``tools``: Cassandra tooling such as ``cassandra-stress`` and ``sstabledump``
* ``data``: keyspaces, commitlogs, hints, caches, and search data files
* ``logs``: node logs

Configuration files
-------------------

The primary operational configuration remains Cassandra-centric:

* cluster name, addresses, snitch, tokens, and replication are defined through Cassandra settings
* Elassandra derives the embedded OpenSearch node identity and network bindings from those settings
* index metadata is stored in Cassandra system keyspaces managed by Elassandra

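In practice, operators should treat ``cassandra.yaml`` and the rack and datacenter
configuration (for example ``cassandra-rackdc.properties`` when
``GossipingPropertyFileSnitch`` is used) as the source of truth for node identity and topology.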
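
As a rough illustration of the "source of truth" idea, the sketch below derives a
search-node identity from a dictionary of Cassandra settings. The input keys
(``cluster_name``, ``listen_address``, ``rpc_address``) are standard ``cassandra.yaml``
keys. The function name, the output keys, and the choice of which address feeds which
binding are assumptions made for illustration; they do not describe Elassandra's
internal mapping.

```python
def derive_search_identity(cassandra_settings):
    """Hypothetical sketch: build a search-node identity from Cassandra settings."""
    required = ("cluster_name", "listen_address", "rpc_address")
    missing = [key for key in required if key not in cassandra_settings]
    if missing:
        raise KeyError("missing Cassandra settings: " + ", ".join(missing))
    return {
        # Assumption: the search cluster reuses the Cassandra cluster name.
        "cluster_name": cassandra_settings["cluster_name"],
        # Assumption: node-to-node traffic binds to the Cassandra listen address.
        "transport_host": cassandra_settings["listen_address"],
        # Assumption: client-facing traffic binds to the Cassandra RPC address.
        "client_host": cassandra_settings["rpc_address"],
    }


# Illustrative values only.
cassandra_settings = {
    "cluster_name": "prod-cluster",
    "listen_address": "10.0.0.5",
    "rpc_address": "10.0.0.5",
}
identity = derive_search_identity(cassandra_settings)
print(identity)
```

The point of the sketch is the direction of the dependency: the search identity is
computed from the Cassandra settings and is never configured separately.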

Logging configuration
---------------------

Elassandra logs Cassandra and embedded OpenSearch activity through Logback.
The main operational log is ``logs/system.log``.

For deeper indexing diagnostics, set a more verbose level such as ``DEBUG`` on the
loggers for the Elassandra indexing classes, for example ``org.elassandra.index``.
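
The following sketch generates a Logback ``<logger>`` element for that package. It only
builds the XML fragment; where the fragment goes is an assumption (typically inside the
``<configuration>`` element of the Logback file shipped under ``conf``). The helper name
``logger_element`` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Levels accepted by Logback for a <logger> element.
LOGBACK_LEVELS = {"TRACE", "DEBUG", "INFO", "WARN", "ERROR", "ALL", "OFF"}


def logger_element(name, level):
    """Return a Logback <logger> element as an XML string."""
    if level not in LOGBACK_LEVELS:
        raise ValueError("unsupported Logback level: " + level)
    element = ET.Element("logger", {"name": name, "level": level})
    return ET.tostring(element, encoding="unicode")


snippet = logger_element("org.elassandra.index", "DEBUG")
print(snippet)
```

Remember to return the logger to its previous level after troubleshooting, since
verbose indexing logs can grow ``logs/system.log`` quickly.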

Multi-datacenter configuration
------------------------------

Each Elassandra datacenter is first of all a Cassandra datacenter: data reaches it through
keyspace replication. Search availability in a datacenter then depends on which keyspaces
are replicated there and on the Elassandra metadata stored in Cassandra.

Important operational rules:

* the indexed keyspace must be replicated to a datacenter before its indices can be opened for search there
* metadata updates require a quorum of replicas of the Elassandra metadata keyspace to be available
* nodes in the same datacenter should run the same runtime mode and plugin set

When only a subset of datacenters should expose particular indices, use Elassandra's
datacenter tagging settings so search metadata is only activated where intended.
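
The first two rules can be checked with simple arithmetic. The sketch below assumes that
"quorum" follows Cassandra's usual ``QUORUM`` definition (more than half of the summed
replication factors); the exact consistency level Elassandra uses for metadata updates is
not stated here, so treat that as an assumption. All function names and the topology are
illustrative.

```python
def quorum_size(replication):
    """Replicas needed for a Cassandra QUORUM: floor(sum of RFs / 2) + 1."""
    total_rf = sum(replication.values())
    return total_rf // 2 + 1


def can_open_search(keyspace_replication, datacenter):
    """True when the indexed keyspace has at least one replica in the datacenter."""
    return keyspace_replication.get(datacenter, 0) > 0


def metadata_update_possible(metadata_replication, live_replicas):
    """True when enough metadata replicas are reachable to form a quorum."""
    return live_replicas >= quorum_size(metadata_replication)


# Hypothetical topology: datacenter names and factors are illustrative only.
indexed_keyspace = {"DC1": 3}
metadata_keyspace = {"DC1": 3, "DC2": 2}

print(can_open_search(indexed_keyspace, "DC1"))
print(can_open_search(indexed_keyspace, "DC2"))
print(quorum_size(metadata_keyspace))
print(metadata_update_possible(metadata_keyspace, live_replicas=2))
```

In this example the indexed keyspace is not replicated to ``DC2``, so search cannot open
there until the keyspace replication is extended and the data is streamed.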

Elassandra settings
-------------------

Elassandra settings can be supplied at several levels:

* JVM system properties, often with the ``es.`` prefix
* cluster defaults
* index settings
* mapping metadata

Common settings used in current deployments include:

.. list-table::
   :widths: 30 30 40
   :header-rows: 1

   * - Setting
     - Scope
     - Purpose
   * - ``keyspace``
     - index
     - Select the backing Cassandra keyspace for an index.
   * - ``replication``
     - index
     - Define the Cassandra replication map for new keyspaces.
   * - ``datacenter_tag``
     - index
     - Restrict visibility of an index to tagged datacenters.
   * - ``table_options``
     - index
     - Apply Cassandra table options during schema creation.
   * - ``search_strategy_class``
     - index or cluster
     - Control how search work is distributed across replicas.
   * - ``synchronous_refresh``
     - index, mapping, or system
     - Refresh search data immediately after writes when needed.
   * - ``drop_on_delete_index``
     - index, cluster, or system
     - Drop backing tables when deleting an index.
   * - ``index_insert_only``
     - index, mapping, or system
     - Skip read-before-write for immutable-style documents.
   * - ``token_ranges_bitset_cache``
     - index or cluster
     - Cache token-range filters for repeated searches.
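
As a minimal sketch of how several index-scoped settings from the table might be combined,
the code below builds a JSON body for index creation. The setting names come from the
table; the helper name, the placement under a ``settings`` object, and the encoding of the
replication map as comma-separated ``datacenter:factor`` pairs are assumptions that should
be checked against your Elassandra version.

```python
import json


def build_index_settings(keyspace, replication, datacenter_tag=None,
                         drop_on_delete_index=False):
    """Build a hypothetical index-creation body from settings in the table."""
    # Assumption: the replication map is written as "DC:RF" pairs joined by commas.
    replication_value = ",".join(
        f"{dc}:{rf}" for dc, rf in sorted(replication.items())
    )
    settings = {
        "keyspace": keyspace,
        "replication": replication_value,
        "drop_on_delete_index": drop_on_delete_index,
    }
    # Only restrict visibility when a tag is explicitly requested.
    if datacenter_tag is not None:
        settings["datacenter_tag"] = datacenter_tag
    return {"settings": settings}


# Illustrative keyspace, datacenter names, and tag.
body = build_index_settings(
    "orders_ks", {"DC1": 2, "DC2": 1}, datacenter_tag="search"
)
print(json.dumps(body, indent=2, sort_keys=True))
```

Leaving ``drop_on_delete_index`` at ``False`` in the sketch keeps the backing tables when
the index is deleted, which is the safer choice while experimenting.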

Sizing and tuning
-----------------

Elassandra nodes need more CPU and memory than Cassandra-only nodes because they handle
both storage and search work.

Write performance
.................

To improve write throughput:

* index only the fields you need
* use singleton-backed fields instead of lists when data is truly single-valued
* avoid synchronous refresh and very short refresh intervals for heavy write workloads
* avoid large hot partitions and very wide rows
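
The first two recommendations can be expressed in a mapping. The sketch below builds field
definitions in which single-valued fields are marked as singletons and a field that is
never searched is left unindexed. ``index: false`` is a standard mapping parameter; the
``cql_collection`` attribute follows the upstream Elassandra mapping convention and is an
assumption for your version, as are the helper and field names.

```python
def field_mapping(field_type, single_valued=True, indexed=True):
    """Build one hypothetical field definition for an index mapping."""
    mapping = {"type": field_type}
    if single_valued:
        # Assumption: "singleton" stores a plain column instead of a CQL list.
        mapping["cql_collection"] = "singleton"
    if not indexed:
        # The field is stored but not searchable, which saves indexing work.
        mapping["index"] = False
    return mapping


# Illustrative fields only.
properties = {
    "order_id": field_mapping("keyword"),
    "tags": field_mapping("keyword", single_valued=False),
    "raw_payload": field_mapping("keyword", indexed=False),
}
print(properties)
```

Only ``tags`` keeps the default multi-valued storage here, because it is the only field
that can actually hold several values.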

Search performance
..................

To improve search throughput:

* keep shard data balanced by maintaining a healthy Cassandra ring
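* choose the Cassandra replication factor with search fan-out in mind: a higher factor lets a search cover the ring with fewer nodes, at the cost of more storage and indexing work per node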
* enable token-range filter caching where repeated search patterns justify it
* use Cassandra row caching carefully when queries repeatedly fetch the same rows
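
To reason about the replication-factor trade-off, the sketch below computes a lower bound
on the number of nodes a search must contact to cover the whole ring once. It assumes a
balanced ring in which every node holds an equal share of the data; real fan-out depends on
token distribution and on the configured search strategy, so treat the result as an
estimate. The function name is hypothetical.

```python
import math


def min_search_fanout(node_count, replication_factor):
    """Lower bound on nodes needed to cover the whole ring once."""
    if node_count < 1 or replication_factor < 1:
        raise ValueError("node_count and replication_factor must be >= 1")
    # A replication factor above the node count adds no extra coverage.
    effective_rf = min(replication_factor, node_count)
    return math.ceil(node_count / effective_rf)


# Fan-out estimate for a six-node datacenter at several replication factors.
for rf in (1, 2, 3, 6):
    print(rf, min_search_fanout(6, rf))
```

With six nodes, moving from a replication factor of 1 to 3 reduces the estimate from six
nodes per search to two, while tripling the data each node stores and indexes.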

For cluster-level operational guidance, also see :doc:`operations` and :doc:`limitations`.