diff --git a/codelabs/get-started-with-vector-db-0/index.md b/codelabs/get-started-with-vector-db-0/index.md
index f6adbb4..50a66f6 100644
--- a/codelabs/get-started-with-vector-db-0/index.md
+++ b/codelabs/get-started-with-vector-db-0/index.md
@@ -10,6 +10,8 @@ Feedback Link: https://github.com/milvus-io/milvus
 
 # Getting Started with Vector Databases - Introduction to Unstructured Data
 
+## Introduction
+
 Data is a key driver of both worldwide integration as well as the global economy. From heart rate monitors worn on wrists to GPS positions of a vehicle fleet to videos uploaded to social media, data is being generated at an exponentially increasing rate. The importance of this ever-increasing amount of data cannot be understated; data can help better serve existing customers, identify supply chain weaknesses, pinpoint workforce inefficiencies, and help companies identify and break into new markets.
 
 IDC predicts that the _global datasphere_ - a measure of the total amount of new data created and stored on persistent storage all around the world - will grow to 400 zettabytes (a zettabyte = 10^21 bytes) by 2028. At that time, over 30% of said data will be generated in real-time, while 80% of all generated data will be _unstructured_.
diff --git a/codelabs/get-started-with-vector-db-1/index.md b/codelabs/get-started-with-vector-db-1/index.md
index 0adbe3e..da90240 100644
--- a/codelabs/get-started-with-vector-db-1/index.md
+++ b/codelabs/get-started-with-vector-db-1/index.md
@@ -10,6 +10,8 @@ Feedback Link: https://github.com/milvus-io/milvus
 
 # Getting Started with Vector Databases - What is a Vector Database?
 
+## Introduction
+
 In the previous tutorial, we took a quick look at the ever-increasing amount of data that is being generated on a daily basis. We then covered how these bits of data can be split into structured/semi-structured and unstructured data, the differences between them, and how modern machine learning can be used to understand unstructured data through embeddings. Finally, we briefly touched upon unstructured data processing via ANN search. Through all of this information, it's now clear that the ever-increasing amount of unstructured data requires a paradigm shift and a new category of database management system - the vector database.
 
 ## Vector databases from 1000 feet
diff --git a/codelabs/get-started-with-vector-db-2/index.md b/codelabs/get-started-with-vector-db-2/index.md
index 11bc5dc..999eecf 100644
--- a/codelabs/get-started-with-vector-db-2/index.md
+++ b/codelabs/get-started-with-vector-db-2/index.md
@@ -10,6 +10,8 @@ Feedback Link: https://github.com/milvus-io/milvus
 
 # Getting Started with Vector Database - Introduction to Milvus
 
+## Introduction
+
 In the previous tutorial, we took a quick tour of vector databases and listed the features an ideal vector database should implement. We then compared vector databases to vector search libraries[^1] and vector search plugins[^2]. Through example code, we found that neither vector search libraries nor vector search plugins fulfill all of the features required to store, index, and search across large datasets of unstructured data. This prompted us to go over some of the technical challenges vector database developers face.
 
 ## Milvus history