Bug #95725

open

Title shown twice with pdfinfo using PDF/X files

Added by Oliver Hader about 3 years ago. Updated over 2 years ago.

Status:
New
Priority:
Should have
Assignee:
-
Category:
Indexed Search
Target version:
-
Start date:
2021-10-21
Due date:
% Done:

0%

Estimated time:
TYPO3 Version:
10
PHP Version:
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

The following report was sent to me by email by Josef Sigritz; I'm just dumping it here:


We have a problem with the FileContentParser of Indexed_Search: for PDF/X files, pdfinfo outputs the Title twice. As a result, the actual title is overwritten.

Example:
pdfinfo test.pdf

 
*Title:          BAA010718_Broschüre_Chancen_bieten_V2.indd*
Creator:        Adobe InDesign CC 13.0 (Macintosh)
Producer:       Adobe PDF Library 15.0
CreationDate:   Thu Feb 22 15:51:27 2018 CET
ModDate:        Mon Mar 12 12:12:12 2018 CET
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          20
Encrypted:      no
Page size:      595.276 x 841.89 pts (A4)
Page rot:       0
File size:      2292621 bytes
Optimized:      yes
PDF version:    1.3
PDF subtype:    PDF/X-3:2002
    *Title:         ISO 15930 - Electronic document file format for prepress digital data exchange (PDF/X)*
    Abbreviation:  PDF/X-3:2002
    Subtitle:      Part 3: Complete exchange suitable for colour-managed workflows (PDF/X-3)
    Standard:      ISO 15930-3

Suggested improvement:
Class: typo3/typo3/sysext/indexed_search/Classes/FileContentParser.php, function splitPdfInfo

public function splitPdfInfo($pdfInfoArray)
    {
        $res = [];
        if (is_array($pdfInfoArray)) {
            foreach ($pdfInfoArray as $line) {
                $parts = explode(':', $line, 2);
                if (count($parts) > 1 && trim($parts[0])) {
                    if (!array_key_exists(strtolower(trim($parts[0])), $res)){
                      $res[strtolower(trim($parts[0]))] = trim($parts[1]);
                    }
                    $res[strtolower(trim($parts[0]))] = trim($parts[1]);
                }
            }
        }
        return $res;
    }
Actions #1

Updated by Oliver Hader about 3 years ago

  • Description updated (diff)
Actions #2

Updated by Josef Sigritz about 3 years ago

Sorry, I forgot to comment out the original line:

    public function splitPdfInfo($pdfInfoArray)
    {
        $res = [];
        if (is_array($pdfInfoArray)) {
            foreach ($pdfInfoArray as $line) {
                $parts = explode(':', $line, 2);
                if (count($parts) > 1 && trim($parts[0])) {
                    if (!array_key_exists(strtolower(trim($parts[0])), $res)) {
                        $res[strtolower(trim($parts[0]))] = trim($parts[1]);
                    }
                }
            }
        }
        return $res;
    }
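
For illustration, here is a minimal standalone sketch of the "first key wins" behaviour proposed above, run against a few lines taken from the pdfinfo output in the description. The function name splitPdfInfoFirstWins and the shortened $lines array are assumptions made for this example and are not part of TYPO3. It also checks the key against the empty string instead of relying on truthiness, which is a small deviation from the snippet above.

```php
<?php
// Hypothetical standalone helper for illustration; not the TYPO3 method itself.
function splitPdfInfoFirstWins(array $pdfInfoLines): array
{
    $res = [];
    foreach ($pdfInfoLines as $line) {
        // Split on the first colon only, so values containing ':' stay intact.
        $parts = explode(':', $line, 2);
        if (count($parts) < 2) {
            continue;
        }
        $key = strtolower(trim($parts[0]));
        // Keep the first occurrence of a key; later duplicates (such as the
        // indented "Title" below "PDF subtype") are ignored.
        if ($key !== '' && !array_key_exists($key, $res)) {
            $res[$key] = trim($parts[1]);
        }
    }
    return $res;
}

// Shortened sample based on the pdfinfo output in the description.
$lines = [
    'Title:          BAA010718_Broschüre_Chancen_bieten_V2.indd',
    'Pages:          20',
    'PDF subtype:    PDF/X-3:2002',
    '    Title:         ISO 15930 - Electronic document file format for prepress digital data exchange (PDF/X)',
];

$info = splitPdfInfoFirstWins($lines);
echo $info['title'], PHP_EOL;
```

With the unpatched logic, which assigns unconditionally, the second "Title" line would replace the document title with the ISO 15930 text.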

Actions #3

Updated by Tomas Norre Mikkelsen over 2 years ago

How to reproduce this?
And how do I recognize the problem in TYPO3?

Do you perhaps have a PDF/X file that you could share, to ease the steps to reproduce?
