Feature #36743

closed

Use text extraction services to get file content

Added by Ingo Renner over 12 years ago. Updated about 7 years ago.

Status: Closed
Priority: Must have
Assignee:
Category: File Abstraction Layer (FAL)
Target version:
Start date: 2012-05-01
Due date:
% Done: 100%
Estimated time:
PHP Version:
Tags:
Complexity:
Sprint Focus:

Description

Currently, FAL simply uses file_get_contents() in its local driver to extract a file's content. This is fine for plain text files, but it does not work for file types such as Office documents and PDFs.

TYPO3 already offers a services infrastructure that allows registering different text extractors. Use the textExtract service to read file contents.
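
To illustrate the idea, here is a minimal, self-contained sketch of dispatching to a matching extractor instead of always calling file_get_contents(). The names TextExtractor, PlainTextExtractor, ExtractorRegistry, canExtract(), extract(), register() and find() are assumptions made up for this example; they are not TYPO3's service API.

```php
<?php
declare(strict_types=1);

// Hypothetical interface for illustration only; not TYPO3's actual API.
interface TextExtractor
{
    public function canExtract(string $path): bool;
    public function extract(string $path): string;
}

// Plain text can be read directly, as the local driver does today.
final class PlainTextExtractor implements TextExtractor
{
    public function canExtract(string $path): bool
    {
        return strtolower(pathinfo($path, PATHINFO_EXTENSION)) === 'txt';
    }

    public function extract(string $path): string
    {
        $content = file_get_contents($path);
        return $content === false ? '' : $content;
    }
}

// Hypothetical registry: returns the first extractor that accepts the file.
final class ExtractorRegistry
{
    /** @var TextExtractor[] */
    private array $extractors = [];

    public function register(TextExtractor $extractor): void
    {
        $this->extractors[] = $extractor;
    }

    public function find(string $path): ?TextExtractor
    {
        foreach ($this->extractors as $extractor) {
            if ($extractor->canExtract($path)) {
                return $extractor;
            }
        }
        // No extractor is registered for this file type.
        return null;
    }
}
```

With only PlainTextExtractor registered, find() returns null for a PDF path. A PDF or Office extractor would be registered the same way, which is the gap this ticket describes.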

Actions #1

Updated by Gerrit Code Review over 12 years ago

  • Status changed from New to Under Review

Patch set 1 for branch master has been pushed to the review server.
It is available at http://review.typo3.org/10916

Actions #2

Updated by Gerrit Code Review over 12 years ago

Patch set 2 for branch master has been pushed to the review server.
It is available at http://review.typo3.org/10916

Actions #3

Updated by Alexander Opitz over 11 years ago

  • Assignee changed from Ingo Renner to Andreas Wolf
  • Target version deleted (6.0.0)

What is the state of the text extraction services? You mentioned in Gerrit that there are other plans to implement this.

Actions #4

Updated by Alexander Opitz almost 10 years ago

  • Target version set to 7.1 (Cleanup)
  • Sprint Focus set to On Location Sprint

Actions #5

Updated by Alexander Opitz almost 10 years ago

  • Category set to File Abstraction Layer (FAL)

Actions #6

Updated by Frans Saris almost 10 years ago

  • Status changed from Under Review to Needs Feedback

You can create your own extractor service that processes a file and returns its readable content, just as is already possible for metadata.

In your extractor, call $file->getForLocalProcessing(); to get the path to the real file (or a temporary local copy of it) and do your magic to fetch the text.
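
A rough sketch of such an extractor, written so that it runs outside TYPO3: LocalProcessableFile and StubFile are stand-ins invented for this example, HtmlTextExtractor and extractText() are hypothetical names, and the "magic" is nothing more than strip_tags() on an HTML file. Only the getForLocalProcessing() call comes from the comment above; the boolean argument is assumed to mirror TYPO3's $writable parameter.

```php
<?php
declare(strict_types=1);

// Stand-in for the part of the file object used here, so the sketch
// runs on its own. In TYPO3 you would receive a real file object.
interface LocalProcessableFile
{
    public function getForLocalProcessing(bool $writable = true): string;
}

// Hypothetical extractor; the text extraction is just strip_tags().
final class HtmlTextExtractor
{
    public function extractText(LocalProcessableFile $file): string
    {
        // Assumption: passing false requests read-only access,
        // which is enough for extracting text.
        $localPath = $file->getForLocalProcessing(false);

        $html = file_get_contents($localPath);
        if ($html === false) {
            return '';
        }

        // Drop the markup, then collapse runs of whitespace.
        $text = preg_replace('/\s+/', ' ', strip_tags($html)) ?? '';
        return trim($text);
    }
}

// Minimal stub that hands back an existing local path.
final class StubFile implements LocalProcessableFile
{
    private string $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    public function getForLocalProcessing(bool $writable = true): string
    {
        return $this->path;
    }
}
```

A PDF or Office extractor would follow the same shape, replacing the strip_tags() step with a call to whatever conversion tool is available on the server.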

Actions #7

Updated by Fabien Udriot almost 10 years ago

A source of inspiration could be EXT:metadata, where we retrieve custom metadata for images and PDFs.

Can we close the ticket?

Actions #8

Updated by Mathias Schreiber almost 10 years ago

  • Status changed from Needs Feedback to Accepted

Actions #9

Updated by Gerrit Code Review almost 10 years ago

  • Status changed from Accepted to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/36556

Actions #10

Updated by Gerrit Code Review almost 10 years ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/36556

Actions #11

Updated by Gerrit Code Review almost 10 years ago

Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/36556

Actions #12

Updated by Gerrit Code Review almost 10 years ago

Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/36556

Actions #13

Updated by Gerrit Code Review almost 10 years ago

Patch set 5 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/36556

Actions #14

Updated by Gerrit Code Review almost 10 years ago

Patch set 6 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/36556

Actions #15

Updated by Gerrit Code Review almost 10 years ago

Patch set 7 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/36556

Actions #16

Updated by Gerrit Code Review over 9 years ago

Patch set 8 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/36556

Actions #17

Updated by Ingo Renner over 9 years ago

  • Status changed from Under Review to Resolved
  • % Done changed from 0 to 100

Actions #18

Updated by Anja Leichsenring almost 9 years ago

  • Sprint Focus deleted (On Location Sprint)

Actions #19

Updated by Riccardo De Contardi about 7 years ago

  • Status changed from Resolved to Closed