Search and Filter Documents

Learn how to find documents in your project using different search methods including text search, semantic search, and advanced filtering. Choose the right search approach based on your needs.

Audience

This guide is designed for construction professionals.

Steps

Step 1: Choose Your Search Method

Agent-Con provides three search approaches: Basic Filtering for browsing and filtering by metadata, Text Search for keyword matching in titles and references, and Semantic Search for content-based search using AI embeddings. Semantic search is available in two forms: hybrid search within document content (Step 4) and cross-entity search (Step 5). Select the method that best fits your needs.

tip

Basic filtering is fastest for known criteria. Text search works well for specific terms. Semantic search is best for finding documents by meaning or related concepts.

Step 2: Basic Document Filtering (GET /api/v1/documents)

Use the Documents API to list and filter documents by metadata. This endpoint supports filtering by project_id (required), document_type (comma-separated, e.g., 'DRAWING,SCHEDULE'), processing_status, and date ranges (created_at_start, created_at_end). Pagination is controlled with page and size. Set include_images=true to include document thumbnails in the response; image_expiration sets the thumbnail expiration in seconds.

tip

This is the most efficient method when you know the document type or date range. Results are paginated and include total count for building UI pagination controls.
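
As a rough illustration of the filters above, the sketch below assembles a request URL for GET /api/v1/documents using only the Python standard library. The host name, the default page size, and the ISO 8601 date format are assumptions made for the example; the parameter names come from this step.

```python
from urllib.parse import urlencode

# Hypothetical host; replace with your Agent-Con deployment URL.
BASE_URL = "https://api.example.com"


def build_documents_url(project_id, document_types=None, processing_status=None,
                        created_at_start=None, created_at_end=None,
                        page=1, size=50, include_images=False,
                        image_expiration=None):
    """Build a GET /api/v1/documents URL from the filters described in Step 2."""
    params = {"project_id": project_id, "page": page, "size": size}
    if document_types:
        # The guide specifies a comma-separated list, e.g. 'DRAWING,SCHEDULE'.
        params["document_type"] = ",".join(document_types)
    if processing_status:
        params["processing_status"] = processing_status
    if created_at_start:
        params["created_at_start"] = created_at_start  # assumed ISO 8601 string
    if created_at_end:
        params["created_at_end"] = created_at_end  # assumed ISO 8601 string
    if include_images:
        params["include_images"] = "true"
        if image_expiration is not None:
            params["image_expiration"] = image_expiration  # seconds
    return f"{BASE_URL}/api/v1/documents?{urlencode(params)}"


url = build_documents_url(
    "proj-123",
    document_types=["DRAWING", "SCHEDULE"],
    processing_status="ACCEPTED",
    include_images=True,
    image_expiration=600,
)
print(url)
```

Building the URL separately from sending it keeps the filter logic easy to check before any request is made.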

Step 3: Text-Based Document Search (GET /api/v1/documents/search)

Use the Documents Search API for keyword-based search. Pass a search query; it is matched against the document title, description, and document_reference fields. Filter by project_id, organization_id, document_type, status, category, or uploaded_by. Use start_date and end_date for date ranges, and control results with the skip, limit, sort_by, and sort_order parameters.

tip

Text search performs exact and partial matching on document metadata. Use this when searching for specific document numbers, titles, or references.
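
A minimal sketch of a text search request follows. The filter, paging, and sorting names are taken from this step, but the name of the query parameter is not stated in this guide, so `query` below is an assumption, as are the default limit and the example sort field.

```python
from urllib.parse import urlencode

# Filters named in this guide for GET /api/v1/documents/search.
ALLOWED_FILTERS = {
    "project_id", "organization_id", "document_type", "status",
    "category", "uploaded_by", "start_date", "end_date",
}


def build_text_search_url(base_url, query, skip=0, limit=20,
                          sort_by=None, sort_order=None, **filters):
    """Build a text search URL, rejecting filters the guide does not list."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"Unsupported filters: {sorted(unknown)}")
    # "query" is an assumed parameter name; check the API reference.
    params = {"query": query, "skip": skip, "limit": limit}
    if sort_by:
        params["sort_by"] = sort_by
    if sort_order:
        params["sort_order"] = sort_order
    # Drop filters that were passed explicitly as None.
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{base_url}/api/v1/documents/search?{urlencode(params)}"


url = build_text_search_url(
    "https://api.example.com",  # hypothetical host
    "fire door",
    project_id="proj-123",
    sort_by="created_at",       # hypothetical sort field
    sort_order="desc",
)
print(url)
```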

Step 4: Hybrid Content Search (POST /api/v1/document-processing/search)

Use the Document Processing Search API for advanced hybrid search that combines keyword matching with vector similarity. Submit a POST request with a SearchQuery object containing your search text. This searches within processed document chunks and returns results grouped by document with relevant excerpts and similarity scores.

tip

Hybrid search uses embeddings to find semantically similar content, not just exact matches. Documents must have processing_status=ACCEPTED to be included in results.
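
The sketch below shows one way to prepare the POST request without sending it. The fields of the SearchQuery object are not listed in this guide, so `query` and `project_id` in the body are assumptions, as is the bearer-token header.

```python
import json
from urllib.request import Request


def build_hybrid_search_request(base_url, text, project_id=None, token=None):
    """Prepare (but do not send) a POST to the hybrid search endpoint."""
    # "query" and "project_id" are assumed SearchQuery field names.
    body = {"query": text}
    if project_id is not None:
        body["project_id"] = project_id
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"  # assumed auth scheme
    return Request(
        f"{base_url}/api/v1/document-processing/search",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_hybrid_search_request(
    "https://api.example.com",  # hypothetical host
    "waterproofing requirements for basement walls",
    project_id="proj-123",
)
print(request.get_method(), request.full_url)
```

The prepared request can be passed to `urllib.request.urlopen` once the real field names and authentication scheme are confirmed.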

Step 5: Cross-Entity Semantic Search (GET /api/v1/search)

Use the Search API to search across multiple entity types in a single request. Set use_semantic_search=true to enable AI-powered similarity search. Specify an entity_types array (e.g., ['DOCUMENT', 'PROJECT', 'TASK']) to choose which data types to search. Filter by project_id, date ranges, status, and processing_status. Results are paginated with the page and page_size parameters.

tip

Semantic search finds documents by meaning, not just keywords. It understands synonyms and related concepts. Use the processing_status filter to exclude unprocessed documents.
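
As an illustration, the following sketch builds a cross-entity search URL. Two details are assumptions: the query parameter name (`query`) and the encoding of the entity_types array as a repeated query parameter, which is a common convention but is not confirmed by this guide.

```python
from urllib.parse import urlencode


def build_semantic_search_url(base_url, query, entity_types, project_id=None,
                              processing_status=None, page=1, page_size=20):
    """Build a GET /api/v1/search URL with semantic search enabled."""
    params = {
        "query": query,  # assumed parameter name
        "use_semantic_search": "true",
        "entity_types": list(entity_types),
        "page": page,
        "page_size": page_size,
    }
    if project_id:
        params["project_id"] = project_id
    if processing_status:
        params["processing_status"] = processing_status
    # doseq=True emits one entity_types=... pair per list element
    # (assumed array encoding).
    return f"{base_url}/api/v1/search?{urlencode(params, doseq=True)}"


url = build_semantic_search_url(
    "https://api.example.com",  # hypothetical host
    "waterproofing details",
    ["DOCUMENT", "TASK"],
    project_id="proj-123",
    processing_status="ACCEPTED",
)
print(url)
```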

Step 6: Filter by Processing Status

When using any search method, filter by document processing_status to control result quality. Values include: PENDING (uploaded but not processed), PROCESSING (currently being analyzed), ACCEPTED (successfully processed and searchable), FAILED (processing error), or NOT_NEEDED (processing skipped). For semantic search, use ACCEPTED status to ensure documents have embeddings.

tip

Only ACCEPTED documents have embeddings for semantic search. FAILED documents may need reprocessing via the POST /api/v1/documents/{document_id}/reprocess endpoint.
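
The status values above can be modeled on the client side as an enumeration. The sketch below sorts documents into those ready for semantic search and those that may need reprocessing; the `id` and `processing_status` keys on each document are assumed field names.

```python
from enum import Enum


class ProcessingStatus(str, Enum):
    PENDING = "PENDING"        # uploaded but not processed
    PROCESSING = "PROCESSING"  # currently being analyzed
    ACCEPTED = "ACCEPTED"      # processed and searchable
    FAILED = "FAILED"          # processing error
    NOT_NEEDED = "NOT_NEEDED"  # processing skipped


def triage(documents):
    """Split documents into searchable IDs and reprocess endpoint paths."""
    searchable, reprocess_paths = [], []
    for doc in documents:
        # Raises ValueError if the API returns a status not listed above.
        status = ProcessingStatus(doc["processing_status"])
        if status is ProcessingStatus.ACCEPTED:
            searchable.append(doc["id"])
        elif status is ProcessingStatus.FAILED:
            reprocess_paths.append(f"/api/v1/documents/{doc['id']}/reprocess")
    return searchable, reprocess_paths


docs = [
    {"id": "doc-1", "processing_status": "ACCEPTED"},
    {"id": "doc-2", "processing_status": "FAILED"},
    {"id": "doc-3", "processing_status": "PENDING"},
]
searchable, reprocess_paths = triage(docs)
print(searchable, reprocess_paths)
```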

Step 7: Handle Search Results

Process the response based on your search method. Basic filtering returns a PaginatedResponseDocumentReturnBase with an items array and pagination metadata. Text search returns Array<DocumentReturnBase>. Hybrid search returns a dictionary of document chunks grouped by document ID. Semantic search returns a SearchResponse with results grouped by entity type, along with relevance scores.

tip

All search results include document metadata. Hybrid and semantic search include relevance scores for ranking. Use document IDs to fetch full details or generate download links.
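
The helpers below sketch how a client might handle the different response shapes. The `items` key comes from this guide; the `score` key on each chunk and the exact layout of the hybrid response are assumptions to be checked against real responses.

```python
import math


def total_pages(total, size):
    """Number of pages needed to show `total` results at `size` per page."""
    return math.ceil(total / size) if size > 0 else 0


def extract_documents(response):
    """Return the document list from a paginated object or a plain array."""
    if isinstance(response, list):
        return response  # text search: Array<DocumentReturnBase>
    return response.get("items", [])  # basic filtering: paginated response


def rank_hybrid_results(chunks_by_document):
    """Order document IDs by their best-scoring chunk, highest first."""
    # Assumes each chunk is a dict with a numeric "score" key.
    best = {
        doc_id: max(chunk["score"] for chunk in chunks)
        for doc_id, chunks in chunks_by_document.items()
        if chunks
    }
    return sorted(best, key=best.get, reverse=True)


hybrid = {
    "doc-1": [{"score": 0.4}, {"score": 0.9}],
    "doc-2": [{"score": 0.7}],
}
print(total_pages(101, 50), rank_hybrid_results(hybrid))
```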

After finding documents, use GET /api/v1/documents/{document_id}/download-link to generate temporary signed URLs for accessing files. For multiple documents, use POST /api/v1/documents/batch-download with an array of document IDs to get multiple download links in one request. Links expire after the configured time.

tip

Download links are temporary and expire for security. The response includes the expiration timestamp. Re-generate links if they expire before use.
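
A short sketch of the download flow follows. The endpoint paths come from this guide; the `document_ids` body field, the ISO 8601 format of the expiration timestamp, and the 30-second safety margin are assumptions chosen for the example.

```python
from datetime import datetime, timezone


def download_link_path(document_id):
    """Path for requesting a temporary signed URL for one document."""
    return f"/api/v1/documents/{document_id}/download-link"


def batch_download_body(document_ids):
    """JSON body for POST /api/v1/documents/batch-download."""
    # "document_ids" is an assumed field name.
    return {"document_ids": list(document_ids)}


def is_expired(expires_at, now=None, safety_margin_seconds=30):
    """True if the link has expired or will expire within the margin."""
    expiry = datetime.fromisoformat(expires_at)  # assumed ISO 8601 timestamp
    if expiry.tzinfo is None:
        # Assume UTC when the timestamp carries no offset.
        expiry = expiry.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expiry - now).total_seconds() <= safety_margin_seconds


now = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(download_link_path("doc-1"))
print(is_expired("2025-01-01T12:10:00+00:00", now=now))
```

Checking the expiration with a small margin before each use lets a client request a fresh link instead of failing mid-download.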

Last updated: 2025-12-11