#36517: Add Native Vector Support for Oracle: VectorField, VectorIndex, and
VectorDistance
-------------------------------------+-------------------------------------
     Reporter:  SAVAN SONI           |                    Owner:  (none)
         Type:  New feature          |                   Status:  new
    Component:  Database layer       |                  Version:  dev
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by SAVAN SONI:

Old description:

> This feature adds native support for Oracle’s Vector:
> [https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse
> /overview-ai-vector-search.html] data type introduced in Oracle 23c. It
> enables AI and ML applications to store and query high-dimensional data
> directly in the database using a new VectorField model field, VectorIndex
> support for similarity search, and ORM expressions for vector operations.
>

>
> == Features Included:
>
> **VectorField model field:
>
> 1. Accepts optional dimensions, storage_format, and storage_type
> arguments.
>
> 2. Supports Dense and Sparse vector storage.
>
> 3. Auto-converts lists, NumPy arrays, and oracledb.SparseVector for
> insert/update.
>

> **Vector Index support:
>
> 1. VectorIndex class using Meta.indexes.
>
> 2. Support for HNSW and IVF index types.
>
> 3. Optional parameters: distance, accuracy, parallel, etc.
>
> **Vector distance expressions and lookups:
>
> 1. Custom Func class VectorDistance for VECTOR_DISTANCE(lhs, rhs, metric)
>
> 2. CosineDistance, EuclideanDistance, and NegativeDotProduct etc. as
> lookups.
>
> 3. Query syntax via filter() and order_by() for similarity search.
>

> == Testing:
>
> 1. Dense and Sparse vector insert/query tests added.
>
> 2. Stress test scripts for repeated inserts/queries included.
>
> == Example:
>
> {{{
> from django.db import models
> VectorIndex = model.VectorIndex
> VectorDistanceType = models.VectorDistanceType
> VectorIndexType = models.VectorIndexType
>
> class Product(models.Model):
>     name = models.CharField(max_length=100)
>     embedding = models.VectorField(dim=3,
> storage_format=VectorStorageFormat.FLOAT32,
> storage_type=VectorStorageType.DENSE)
>
>     class Meta:
>         indexes = [
>             VectorIndex(
>                 fields=["embedding"],
>                 name="vec_idx_product",
>                 index_type=VectorIndexType.HNSW,
>                 distance=VectorDistanceType.COSINE,
>             )
>         ]
>
> }}}
>
> And a Similarity search can be performed
>
> {{{
> query_vector = array.array("f", [1.0, 2.0, 3.0])
>
>     products = Product.objects.annotate(
>         score=VectorDistance(
>             "embedding",
>             query_vector,
>             metric=VectorDistanceType.COSINE,
>         )
>     ).order_by("score")[:5]
> }}}
>

> == Implementation Status
> We have already implemented:
>
> 1. Custom VectorField with support for DENSE and SPARSE formats
>
> 2. Automatic SQL generation for model/table creation
>
> 3. VectorIndex support with customizable parameters and distance metrics
>
> 4. ORM expressions and lookups for vector distance queries (e.g.,
> CosineDistance, EuclideanDistance)
>
> 5. Basic tests for dense vector creation, insertion, indexing, and
> querying
>
> 6. Integration with Oracle’s Python driver (oracledb) for runtime
> behavior
>
> == PR Readiness
>
> We have finalized the major components of this feature and are ready to
> open a public pull request after community feedback or approval of this
> feature proposal.

New description:

 This feature adds native support for Oracle Database's
 [https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse/overview-ai-vector-search.html Vector data type],
 introduced in Oracle Database 23ai (formerly 23c). It enables AI and ML
 applications to store and query high-dimensional data directly in the
 database using a new VectorField model field, VectorIndex support for
 similarity search, and ORM expressions for vector operations.



 == Features Included:

 **VectorField model field:**

 1. Accepts optional dimensions, storage_format, and storage_type
 arguments.

 2. Supports Dense and Sparse vector storage.

 3. Auto-converts lists, NumPy arrays, and oracledb.SparseVector for
 insert/update.
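
The auto-conversion in item 3 could look roughly like the sketch below. It is not the code from the proposed patch: `to_dense_vector` and its parameters are hypothetical names, and it assumes the driver accepts `array.array` values for dense vector binds.

```python
import array


def to_dense_vector(value, dimensions=None, typecode="f"):
    """Coerce a sequence of numbers into an array.array (hypothetical helper)."""
    if isinstance(value, array.array):
        vector = value
    else:
        # Lists, tuples and NumPy arrays are all iterables of numbers.
        vector = array.array(typecode, (float(x) for x in value))
    # Reject values whose length does not match the declared dimension count.
    if dimensions is not None and len(vector) != dimensions:
        raise ValueError(f"expected {dimensions} dimensions, got {len(vector)}")
    return vector
```

Validating the length before the value reaches the database would let a field raise a clear Python error instead of a driver-level one.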


 **Vector Index support:**

 1. VectorIndex class using Meta.indexes.

 2. Support for HNSW and IVF index types.

 3. Optional parameters: distance, accuracy, parallel, etc.
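
To illustrate how the index options might map to DDL, here is a minimal sketch that builds a `CREATE VECTOR INDEX` statement as a string. The function name, its parameters, and the `app_product` table are hypothetical, and the clause order follows my reading of Oracle's syntax, so it should be checked against the Oracle documentation. A real backend would also quote identifiers.

```python
def vector_index_sql(name, table, column, index_type="HNSW",
                     distance=None, accuracy=None, parallel=None):
    """Build a CREATE VECTOR INDEX statement (illustrative sketch only)."""
    # HNSW indexes are in-memory neighbor graphs; IVF indexes are
    # neighbor partitions.
    organization = {
        "HNSW": "INMEMORY NEIGHBOR GRAPH",
        "IVF": "NEIGHBOR PARTITIONS",
    }[index_type]
    parts = [
        f"CREATE VECTOR INDEX {name} ON {table} ({column})",
        f"ORGANIZATION {organization}",
    ]
    # Optional clauses are appended only when a value was supplied.
    if distance is not None:
        parts.append(f"DISTANCE {distance}")
    if accuracy is not None:
        parts.append(f"WITH TARGET ACCURACY {int(accuracy)}")
    if parallel is not None:
        parts.append(f"PARALLEL {int(parallel)}")
    return " ".join(parts)


print(vector_index_sql("vec_idx_product", "app_product", "embedding",
                       distance="COSINE", accuracy=95))
```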

 **Vector distance expressions and lookups:**

 1. Custom Func class VectorDistance for VECTOR_DISTANCE(lhs, rhs, metric)

 2. Lookups such as CosineDistance, EuclideanDistance, and
 NegativeDotProduct.

 3. Query syntax via filter() and order_by() for similarity search.
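
For readers unfamiliar with the metrics named above, the following pure-Python sketch shows what each one computes. It is only a reference for the math, not the proposed ORM code, and the function names are chosen here for illustration.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance between the two points."""
    return math.dist(a, b)


def negative_dot_product(a, b):
    """Negated dot product, so that smaller values mean more similar."""
    return -sum(x * y for x, y in zip(a, b))
```

All three are oriented so that a smaller value means a closer match, which is why ordering by the distance in ascending order yields the nearest neighbors first.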


 == Testing:

 1. Dense and Sparse vector insert/query tests added.

 2. Stress test scripts for repeated inserts/queries included.

 == Example:

 {{{
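 from django.db import models

 # Aliases for the proposed vector API. All of these names are assumed to
 # be exposed from django.db.models.
 VectorIndex = models.VectorIndex
 VectorDistanceType = models.VectorDistanceType
 VectorIndexType = models.VectorIndexType
 VectorStorageFormat = models.VectorStorageFormat
 VectorStorageType = models.VectorStorageType

 class Product(models.Model):
     name = models.CharField(max_length=100)
     # A dense, 3-dimensional vector of 32-bit floats.
     embedding = models.VectorField(
         dim=3,
         storage_format=VectorStorageFormat.FLOAT32,
         storage_type=VectorStorageType.DENSE,
     )

     class Meta:
         indexes = [
             VectorIndex(
                 fields=["embedding"],
                 name="vec_idx_product",
                 index_type=VectorIndexType.HNSW,
                 distance=VectorDistanceType.COSINE,
             )
         ]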
 }}}

 A similarity search can then be performed:

 {{{
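 import array

 # VectorDistance is part of this proposal; its import path is assumed here.
 from django.db.models import VectorDistance

 query_vector = array.array("f", [1.0, 2.0, 3.0])

 # Annotate each product with its cosine distance to the query vector and
 # keep the five closest matches.
 products = Product.objects.annotate(
     score=VectorDistance(
         "embedding",
         query_vector,
         metric=VectorDistanceType.COSINE,
     )
 ).order_by("score")[:5]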
 }}}
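
As a rough mental model of the query above, the in-memory sketch below scores every row by cosine distance and keeps the k smallest. It is a brute-force stand-in written for illustration (the `nearest` helper and the row format are hypothetical); the database would use the vector index where it can rather than scanning every row.

```python
import heapq
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def nearest(rows, query_vector, k=5):
    """Return the k (score, name) pairs with the smallest distance.

    rows is an iterable of (name, embedding) tuples.
    """
    scored = ((cosine_distance(emb, query_vector), name) for name, emb in rows)
    return heapq.nsmallest(k, scored)


rows = [
    ("a", [1.0, 2.0, 3.0]),
    ("b", [3.0, 2.0, 1.0]),
    ("c", [-1.0, -2.0, -3.0]),
]
for score, name in nearest(rows, [1.0, 2.0, 3.0], k=2):
    print(name, round(score, 4))
```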


 == Implementation Status
 We have already implemented:

 1. Custom VectorField with support for DENSE and SPARSE formats

 2. Automatic SQL generation for model/table creation

 3. VectorIndex support with customizable parameters and distance metrics

 4. ORM expressions and lookups for vector distance queries (e.g.,
 CosineDistance, EuclideanDistance)

 5. Basic tests for dense vector creation, insertion, indexing, and
 querying

 6. Integration with Oracle’s Python driver (oracledb) for runtime behavior

 == PR Readiness

 We have finalized the major components of this feature and are ready to
 open a public pull request after community feedback or approval of this
 feature proposal.

--
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36517#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
