mgleavitt/my_directory_loader

my_directory_loader

This single-file repository contains a modified version of LangChain's DirectoryLoader class. It loads a directory of documents into a DocumentIndex object while working around an apparent bug in Unstructured's PDF support, which causes kernel crashes in Jupyter and segmentation faults when run from .py files.

The only difference from LangChain's DirectoryLoader is that this version accepts a separate loader class for PDF files, pdfloader_cls, which defaults to PyPDFLoader (a wrapper around PyPDF2). All other file types go through UnstructuredFileLoader by default.
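
The routing described above can be illustrated with a minimal sketch. This is not the repository's actual code: the stub classes, `pick_loader_cls`, and `load_paths` are hypothetical names that stand in for UnstructuredFileLoader, PyPDFLoader, and the loader's internal dispatch, and the case-insensitive `.pdf` match is an assumption.

```python
from pathlib import Path
from typing import Iterable, List, Type


class StubUnstructuredLoader:
    """Hypothetical stand-in for UnstructuredFileLoader; does no real parsing."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path

    def load(self) -> List[str]:
        return [f"unstructured:{self.file_path}"]


class StubPdfLoader(StubUnstructuredLoader):
    """Hypothetical stand-in for PyPDFLoader; does no real parsing."""

    def load(self) -> List[str]:
        return [f"pdf:{self.file_path}"]


def pick_loader_cls(
    path: str,
    loader_cls: Type[StubUnstructuredLoader] = StubUnstructuredLoader,
    pdfloader_cls: Type[StubUnstructuredLoader] = StubPdfLoader,
) -> Type[StubUnstructuredLoader]:
    """Return pdfloader_cls for .pdf files and loader_cls for everything else."""
    # Assumption: the suffix check ignores case, so "REPORT.PDF" counts as a PDF.
    if Path(path).suffix.lower() == ".pdf":
        return pdfloader_cls
    return loader_cls


def load_paths(
    paths: Iterable[str],
    loader_cls: Type[StubUnstructuredLoader] = StubUnstructuredLoader,
    pdfloader_cls: Type[StubUnstructuredLoader] = StubPdfLoader,
) -> List[str]:
    """Load each path with the loader class chosen for its file type."""
    docs: List[str] = []
    for path in paths:
        chosen = pick_loader_cls(path, loader_cls, pdfloader_cls)
        docs.extend(chosen(str(path)).load())
    return docs
```

Keeping the PDF loader as its own parameter means PDF handling can be swapped without touching how other file types are loaded, which is the point of the workaround.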

For more details on the parameters and returned values of DirectoryLoader, please refer to LangChain's documentation:
