Update the embed_content in RecommendationsAdapter #4455

Closed
Tracked by #4044
akolson opened this issue Feb 26, 2024 · 1 comment

akolson commented Feb 26, 2024

Overview

This task updates embed_content in RecommendationsAdapter so that it collects, for each node, all file URLs from which textual content will be extracted.

Description and outcomes

  • Update embed_content in RecommendationsAdapter
    • embed_content accepts a list of nodes (ContentNode) as a parameter.
    • For each node, find all file URLs to be extracted.
    • Use the kind field of ContentNode and the preset field of File to determine which file URLs to extract.
    • Currently, all Studio files are stored in this bucket.
  • Finding file URLs
    • Audio files (mp3)
      • Return the corresponding URL(s)
    • Video files (mp4, webm)
      • Return the corresponding subtitle URL(s) if they exist; otherwise return the URL(s) of the video files themselves
    • HTML files (html5)
      • HTML files are uploaded as zip files and extracted into this bucket
      • Return the corresponding URL(s) of the extracted zip location.
    • H5P files (h5p)
      • Return the corresponding URL(s)
    • ZIM files (zim)
      • Return the corresponding URL(s)
    • Document files (pdf, epub)
      • Return the corresponding URL(s)
    • Exercise files (Perseus)
      • Return the corresponding URL(s)
  • Making a request
    • Make a request to the recommendations backend. For example:
      body = {
          'resources': resources,
          'metadata': {}
      }
      embed_content_request = EmbedContentRequest(
          headers={},  # Left empty for now so headers can be passed to the external API
          params={},  # Likewise for query parameters
          body=body
      )
      return self.backend.make_request(embed_content_request)
      
      Where resources is the updated embed_content_request.json
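
The selection rules listed above can be sketched roughly as follows. This is a minimal illustration only, not the Studio implementation: the ContentNode and File dataclasses are hypothetical stand-ins for the real Django models, the preset name "video_subtitle", the extracted_url attribute, and the shape of each resources entry are all assumptions (the actual shape is whatever embed_content_request.json defines), and any further preset-based filtering is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


# Hypothetical stand-ins for Studio's File and ContentNode models.
@dataclass
class File:
    preset: str                          # assumed preset name, e.g. "video_subtitle"
    url: str                             # URL of the stored file
    extracted_url: Optional[str] = None  # assumed: location of an extracted HTML5 zip


@dataclass
class ContentNode:
    id: str
    kind: str                            # assumed kind name, e.g. "video", "html5"
    files: List[File] = field(default_factory=list)


SUBTITLE_PRESET = "video_subtitle"       # assumption, not confirmed by the issue


def file_urls_for_node(node: ContentNode) -> List[str]:
    """Return the URLs that content should be extracted from for one node."""
    if node.kind == "video":
        # Prefer subtitles; fall back to the video files if there are none.
        subtitles = [f.url for f in node.files if f.preset == SUBTITLE_PRESET]
        if subtitles:
            return subtitles
        return [f.url for f in node.files]
    if node.kind == "html5":
        # HTML5 apps are uploaded as zips; use the extracted location instead.
        return [f.extracted_url for f in node.files if f.extracted_url]
    # Audio, H5P, ZIM, document and exercise nodes: return the file URLs as-is.
    return [f.url for f in node.files]


def build_embed_content_body(nodes: List[ContentNode]) -> Dict:
    """Build a request body with the same top-level keys as the example above."""
    resources = [{"id": n.id, "files": file_urls_for_node(n)} for n in nodes]
    return {"resources": resources, "metadata": {}}
```

The returned dictionary could then be passed as body to EmbedContentRequest, as in the example above. Keeping the per-kind rules in one small function makes it straightforward to check each kind against the acceptance criteria independently of the backend call.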

Acceptance Criteria

  • All file URLs associated with the passed nodes in embed_content are extracted correctly

Assumptions and Dependencies

Scope

The scope of this task is limited to:

  • Updating embed_content to gather all file URLs required for content extraction.

Accessibility Requirements

NA

Resources

@akolson akolson changed the title [WIP - DO NOT ASSIGN]: Implement the extract_content method to extract content from a content node [WIP - DO NOT ASSIGN]: Update the embed_content in RecommendationsAdapter Apr 15, 2024
@akolson akolson changed the title [WIP - DO NOT ASSIGN]: Update the embed_content in RecommendationsAdapter Update the embed_content in RecommendationsAdapter Apr 22, 2024
@akolson akolson mentioned this issue Jul 15, 2024
24 tasks

akolson commented Aug 12, 2024

Closed by #4604

@akolson akolson closed this as completed Aug 12, 2024