Extract highlighted text from a PDF file and export it to another PDF file

  • Author
    Topic
  • #46651
    Rakesh Kumar
    Participant

    Sir,

    My requirement:

    The user highlights text in the file.
    He then runs a batch script or a button script (whichever is possible).
    All the comments are exported to a different PDF file in the format below:

    -some title-
    Page Number – Highlighted text 1
    Page Number – Highlighted text 2
    .
    .
    .
    Page Number – Highlighted text n

    I already have a batch script from Mr. Thom Parker, available on the internet, that does the same thing, but it exports to XLS format and my requirement is PDF.
    I hope I can get some help.
    Thanks in advance.
    @Merlin

Showing 30 replies, 1 to 30 (of 30 total)
  • Author
    Replies
  • #69576
    bebarth
    Keymaster

    Hi,
    It’s quite a big job, but if Thom’s script does what you want apart from the export format, the simplest approach is to adapt it.
    Please share his script.
    @+
    :bonjour:
    PS: I will be back next week.

    #69577
    Rakesh Kumar
    Participant

    Sir,

    Attached is the batch script I found on Mr. Thom Parker’s blog. I need a few changes to it:

    1) The exported file should be in PDF format.
    2) It should not be an attachment: the generated file should be a standalone file, without the summary (for now).
    3) The format should be:

      -Some Heading-

    Page No. – Type of annotation (optional) – Text1
    Page No. – Type of annotation (optional) – Text2
    and so on.
    Also, if possible, show somewhere on the page the date and time when the file was generated (I should be able to hide or show this according to my future requirements).
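
For illustration, the layout requested above could be produced by a small formatting helper like the one below. This is only a minimal sketch in plain JavaScript, not one of the scripts discussed in this thread. The names `buildReportLines`, `formatStamp` and `pad2`, the option names, and the `{page, type, text}` record shape are all hypothetical.

```javascript
// Hypothetical helper: turns annotation-like records into report lines.
// Each record is assumed to look like {page: 0-based number, type: string, text: string}.
function pad2(n) {
    return (n < 10 ? "0" : "") + n;
}

// Formats a Date as yyyy/mm/dd HH:MM in local time.
function formatStamp(d) {
    return d.getFullYear() + "/" + pad2(d.getMonth() + 1) + "/" + pad2(d.getDate()) +
        " " + pad2(d.getHours()) + ":" + pad2(d.getMinutes());
}

function buildReportLines(records, options) {
    var opts = options || {};
    var lines = [];
    lines.push(opts.heading || "Highlighted text report");
    if (opts.showTimestamp) {
        // Optional line: switch it off by leaving showTimestamp unset.
        lines.push("Generated: " + formatStamp(opts.generatedAt || new Date()));
    }
    for (var i = 0; i < records.length; i++) {
        var r = records[i];
        var parts = ["Page " + (r.page + 1)];   // pages are 0-based internally
        if (opts.showType) {
            parts.push(r.type);                 // optional annotation type column
        }
        parts.push(r.text);
        lines.push(parts.join(" - "));
    }
    return lines;
}
```

Keeping the annotation type and the timestamp behind options means either can be hidden without editing the loop.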

    I have another script, found on the official Adobe Acrobat forum, which does exactly what I want but prints the data in the console, not in a PDF. That script is included as well; I hope it also helps.

    The Adobe script comes after Thom’s script.

    Thanks
    @Bebarth

    #69578
    bebarth
    Keymaster

    Hi,
    Could you also share the URL of the blog where you found the example?
    Thanks.
    @+
    :bonjour:

    #69579
    Rakesh Kumar
    Participant

    I found the batch script at https://acrobatusers.com/actions-exchange/ : go to this page, select Acrobat XI and look for “Create comment summary”.
    I don’t remember where the other one came from, as I saved it a long time ago.

    #69580
    Merlin
    Keymaster

    You should try the COMMENTS REPORT plugin, which is part of the free abracadabraTools:
    https://www.abracadabrapdf.net/utilitaires/utilities-in-english/abracadabratools_en/

    It can be customized (I can give you the source JS file) to work with Acrobat Pro, so that comments are exported as a new “Report”, which is a regular PDF file.

    See:
    https://opensource.adobe.com/dc-acrobat-sdk-docs/acrobatsdk/html2015/Acro12_MasterBook/JS_API_AcroJS/Report.htm?rhhlterm=report&rhsyns=%20
    &
    https://opensource.adobe.com/dc-acrobat-sdk-docs/acrobatsdk/html2015/Acro12_MasterBook/JS_API_AcroJS/Report_methods.htm?rhhlterm=report&rhsyns=%20
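
As a rough illustration of the Report approach mentioned above, the sketch below writes a heading and a list of lines through a Report-like object. It relies only on members listed in the Report documentation linked above (`size`, `writeText`, `divide`, `indent`, `outdent`, `open`). The function name `writeLinesToReport` is hypothetical, and the report object is passed in as a parameter so the function can also be tried with a stand-in outside Acrobat. This is an assumption-laden sketch, not Merlin's plugin code.

```javascript
// Writes a heading and one line per entry through a Report-like object.
// In Acrobat Pro, `rep` would be `new Report()`.
function writeLinesToReport(rep, heading, lines) {
    rep.size = 1.6;          // text size multiplier: larger text for the heading
    rep.writeText(heading);
    rep.divide();            // horizontal rule under the heading
    rep.size = 1.0;          // back to normal size for the entries
    rep.indent(20);
    for (var i = 0; i < lines.length; i++) {
        rep.writeText(lines[i]);
    }
    rep.outdent(20);
    return rep;
}

// Possible use from the Acrobat console (not executed here):
// var rep = writeLinesToReport(new Report(), "-Some heading-", ["Page 1 - some text"]);
// rep.open("Highlighted text report");   // opens the report as a new PDF document
```

The result is an ordinary PDF document, so it can then be saved like any other file.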

    #69581
    Rakesh Kumar
    Participant

    @Merlin

    Sir,
    I want a comments report that works in Acrobat Pro as well; this tool says it is for Acrobat Reader only.

    Thanks

    #69582
    bebarth
    Keymaster

    Hi,
    In fact, I didn’t understand your request exactly.
    I thought you wanted to recover the highlighted text; that’s why I told you it’s quite a big job.
    I have never used the Report object, so it would be great if Merlin could share his source script (in this thread or to my personal email).
    After a quick look at it, I don’t know whether it is possible to restrict the report to the annotation types you want.
    It seems the report covers all types…
    Anyway, it is possible to do that another way with a script.
    Do you want the report in a single field? Let me know.
    @+
    :bonjour:

    #69583
    Rakesh Kumar
    Participant

    Sir,

    I request you to please edit Mr. Thom’s script, changing only the export format from *.xls to *.pdf and making the output a standalone file rather than an attachment.

    I want all the highlighted text and commented text to be in the report in the following format:

    -Some Heading at top of page-

    Page No.  –  Annotation type  –  Text1      (the annotation type is optional, i.e. write it so that it can be commented out in the script if necessary)
    Page No.  –  Annotation type  –  Text2
    Page No.  –  Annotation type  –  Text3
    and so on…
    It does not matter whether this is in a single field or in multiple fields.
    Thanks

    #69584
    bebarth
    Keymaster

    Hi,
    So that is what I understood the first time, but it is not what Thom’s or Merlin’s scripts do!
    Neither script gives you the highlighted text, and it’s a big job to do that…
    It’s an interesting problem, but I’m not sure I will be able to look at it right now. I will let you know.
    @+
    :bonjour:

    #69585
    Rakesh Kumar
    Participant

    The second script in the text file also gives the highlighted text (please check the text file I sent you; it comes after Thom’s script and is a menu-type option). It gives me the highlighted text, the commented text, etc., but it prints the text in the console. Please check that script. My requirement is exactly what that script does, except that the output should go to a PDF instead of the console.

    #69586
    Merlin
    Keymaster

    Here is the script; I removed the “Reader only” condition.
    The interesting part is between lines 55 and 90 of the script (the body of the listeDesComments function).

    Code:
    // abracadabraTools DC - Comments report - List comments
    //
    if (app.formsVersion > 9) { // localized strings, defined for every viewer (no "Reader only" test)
        if (app.language == "FRA") {
            var strRapComment00 = "Lister tous les commentaires du document dans la Console"; // button tooltip
            var strRapComment01 = "Lister les commentaires..."; // menu
            var strRapComment01b = "Commentaires liste"; // button
            var strRapComment02 = "Choisir un mode de classement";
            // var strRapComment03 = "Aucun";
            var strRapComment04 = "Page";
            var strRapComment05 = "Auteur";
            var strRapComment06 = "Date";
            var strRapComment07 = "Type";
            var strRapComment08 = "LISTE DES COMMENTAIRES DU DOCUMENT : ";
            var strRapComment09 = "Aucun commentaire n'a \u00E9t\u00E9 d\u00E9tect\u00E9 dans ce document";
        }
        else {
            var strRapComment00 = "Report all document comments in the Console"; // button tooltip
            var strRapComment01 = "Comments Report..."; // menu
            var strRapComment01b = "Comments Report"; // button
            var strRapComment02 = "Select a sort type";
            // var strRapComment03 = "None";
            var strRapComment04 = "Page";
            var strRapComment05 = "Author";
            var strRapComment06 = "Date";
            var strRapComment07 = "Type";
            var strRapComment08 = "LIST OF COMMENTS FOR DOCUMENT:";
            var strRapComment09 = "No comments were detected in this document";
        }
    }
    //
    ////////////////////////////////////////////////////////////////////////////////////////////////////
    //
    if (app.formsVersion > 9) {
        // add the menu
        // if the abracadabraTools menu does not already exist
        var strNomMenu = "abracadabraTools \u002A";
        if (global.aTmenu != 1) {
            app.addSubMenu({ cName: strNomMenu, cParent: "Edit", nPos: 0});
            app.addMenuItem({ cName: "-", cParent: "Edit", nPos: 0, cEnable: false, cExec: null});
            global.aTmenu = 1;
        }
        app.addMenuItem({
            cName: "Commentaires_rapport",
            cUser: strRapComment01,
            cParent: strNomMenu,
            cExec: "listeDesComments(this)",
            nPos: 0,
            cEnable: "event.rc = app.doc;"
        });
        //
        //
        listeDesComments = app.trustedFunction (function (doc) {
            var aClassement = [];
            // Prompt the user for the sorting type
            aClassement.push({cName: strRapComment02, bEnabled: false});
            aClassement.push({cName: "-"});
            // aClassement.push({cName: strRapComment03, cReturn: ANSB_None});
            aClassement.push({cName: strRapComment04, cReturn: ANSB_Page});
            aClassement.push({cName: strRapComment05, cReturn: ANSB_Author});
            aClassement.push({cName: strRapComment06, cReturn: ANSB_ModDate});
            aClassement.push({cName: strRapComment07, cReturn: ANSB_Type});
            var nSortType = app.popUpMenuEx.apply(app, aClassement) || ANSB_None;
            //
            // Change to true to reverse the sort order
            var bReverseOrder = false;
            //
            doc.syncAnnotScan();
            var a = doc.getAnnots({nSortBy: nSortType, bReverse: bReverseOrder});
            //
            if (a) {
                var msg = "Page %s par %s le %s";
                console.clear();
                console.show();
                console.println(strRapComment08 + this.documentFileName + "\r\r\r");
                // one header line and one contents line per annotation
                for (var i = 0; i < a.length; i++) {
                    console.println(util.printf(msg, 1 + a[i].page, a[i].author, util.printd("yyyy/mm/dd HH:MM:ss", a[i].creationDate)));
                    console.println(a[i].contents + "\r\r");
                }
            }
            else {
                console.clear();
                console.show();
                console.println(strRapComment08 + this.documentFileName + "\r\r");
                console.println(strRapComment09);
                // app.alert({cMsg: crop13, cTitle: strTitreId, oCheckbox: oCase, nIcon: 2, nType: 2})
            }
        });
        //
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        //
        // comments report icon
    var strIconComReport = “ffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffffffffffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffffffffffb40000ffb40000ffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffffffffffb40000ffb40000ffffffffffb40000ffb40000ffffffffffffffffffffffffffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffffffffffffffffffffffffffb40000ffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffb40000ffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffffffffffffffffffffffffffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffb40000ffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffb40000ffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffb40000ffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffb40000ffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffb40000ffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffffffffffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffb40000ffffffff”;
        //
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        //
        // add the button
        var oIconComReport = {count: 0, width: 20, height: 20, read: function(nBytes) {return strIconComReport.slice(this.count, this.count += nBytes);}};
        var comReportBouton = {
            cName: "Commentaires-rapport",
            cExec: "listeDesComments(this)",
            cEnable: "event.rc = event.target != null",
            cMarked: "event.rc = false",
            cTooltext: strRapComment00,
            oIcon: oIconComReport,
            cLabel: strRapComment01b
        };
        //
        try { app.removeToolButton("comReport"); } catch (e) {}
        try { app.addToolButton(comReportBouton); } catch (e) {}
        //
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        //
        // Help menu
        // if the abracadabraTools Help menu does not already exist
        if (global.aTmenAide != 1) {
            if (app.language == "FRA") {
                var strMenAide00 = "abracadabraTools";
                var strMenAide01 = "Support & assistance...";
                // var strMenAide02 = "T\u00E9l\u00E9chargement";
                var strMenAide02 = "Actualisation...";
                var strMenAide03 = "https://www.abracadabrapdf.net/?p=111";
                var strMenAide04 = "Site web";
                var strMenAide05 = "https://www.abracadabrapdf.net/";
            }
            else {
                var strMenAide00 = "abracadabraTools";
                var strMenAide01 = "Support & Assistance...";
                var strMenAide02 = "Check for update...";
                var strMenAide03 = "https://www.abracadabrapdf.net/?p=972";
                var strMenAide04 = "Web Site";
                var strMenAide05 = "https://www.abracadabrapdf.net/?p=1591";
            }
            // ADD THE MENU
            app.addMenuItem({ cName: "-", cParent: "Help", nPos: 21, cEnable: false, cExec: null});
            app.addSubMenu({ cName: strMenAide00, cParent: "Help", nPos: 22});
            app.addMenuItem({ cName: strMenAide04, cParent: strMenAide00, nPos: 0, cExec: "app.launchURL(strMenAide05);"});
            app.addMenuItem({ cName: strMenAide02, cParent: strMenAide00, nPos: 1, cExec: "app.launchURL(strMenAide03);"});
            app.addMenuItem({ cName: strMenAide01, cParent: strMenAide00, nPos: 2, cExec: "app.launchURL('https://abracadabrapdf.net/forum/');"});
            //
            // update the flag variable
            global.aTmenAide = 1;
        }
        //
        ////////////////////////////////////////////////////////////////////////////////////////////////////
        //
    }
    //
    ////////////////////////////////////////////////////////////////////////////////////////////////////
    #69587
    Rakesh Kumar
    Participant

    @Merlin

    Sir,
    I copied the script, but the menu is not showing in Acrobat Pro.

    #69588
    bebarth
    Keymaster

    Hi,

    “the second script in the text file gives us the highlighted text also…”

    You are right, I will have a look soon.
    @+
    :bonjour:

    #69589
    Rakesh Kumar
    Participant

    The second script is a menu item, but I don’t know whether it works in both Pro and Reader. I tested it in Pro, where it works, but I’m not sure about Reader. It would be helpful if it worked in both.

    #69590
    bebarth
    Keymaster

    Hi,
    If you want to add a field to display the result and then save that result, it can only be done with Acrobat Pro.
    That seems rather simple. I’ll send you the new script tomorrow.
    @+
    :bonjour:

    #69591
    Merlin
    Keymaster

    Save this script as a .js file and place it in your Acrobat JavaScripts folder.
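
If it is unclear where that folder is, the Acrobat JavaScript method `app.getPath` can report it. The sketch below is a hedged illustration: the wrapper name `scriptFolders` is hypothetical, and the app object is passed as a parameter only so the function can be tried with a stand-in outside Acrobat. In Acrobat, `app.getPath` is restricted to privileged contexts such as the console.

```javascript
// Returns the folder-level JavaScript locations reported by an app-like object.
// `appObj` is expected to expose getPath(category, folder) as Acrobat's `app` does.
function scriptFolders(appObj) {
    var result = {};
    var categories = ["app", "user"];
    for (var i = 0; i < categories.length; i++) {
        try {
            result[categories[i]] = appObj.getPath(categories[i], "javascript");
        } catch (e) {
            // getPath may throw when the folder does not exist yet.
            result[categories[i]] = null;
        }
    }
    return result;
}

// Possible use from the Acrobat console (not executed here):
// var f = scriptFolders(app);
// console.println("app:  " + f.app);
// console.println("user: " + f.user);
```

Folder-level scripts are loaded at startup, so Acrobat has to be restarted after the .js file is copied into one of these folders.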

    #69592
    Rakesh Kumar
    Participant

    @Merlin

    Sir,

    I did the same, but still no menu is visible under Edit.

    Thanks

    #69593
    Merlin
    Keymaster

    Download this attachment and install it.
    It’s a version made for you.

    #69594
    Rakesh Kumar
    Participant

    Still no menu showing up under the Edit menu.

    #69595
    bebarth
    Keymaster

    Hi,
    Here is my adaptation (modifications follow the lines marked // bebarth). I tried to create a new page and save the file but couldn’t! I think it’s a trustedFunction problem.
    If you find out how to do that, let me know… Merlin, any idea?
    Change the extension of the attached file from .txt to .js, then place the file in the JavaScript folder of your Acrobat Pro.
    The script will report the highlighted words, but don’t forget the note from the original script:

    Code:
    * Text returned may not always match exactly the text covered by the highlight.
    * This is mainly dependent upon two things: 1) whether a word is adjacent to
    * punctuation or not, and 2) whether whole words or partial words are highlighted
    * or not.
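
The note above is consistent with word-level matching: a highlight annotation stores only rectangles (quads), so the text has to be recovered by checking which words on the page fall under them. The sketch below shows one possible way to do that, and it is not the attached script. The helper names are hypothetical. It assumes quads arrive as a list of 8-number arrays, which is what `getPageNthWordQuads` returns, and it treats a word as highlighted when the centre of its box lies inside a highlight quad. A partially covered word is therefore either taken whole or skipped.

```javascript
// Bounding box [minX, minY, maxX, maxY] of one quad given as 8 numbers
// (x1, y1, x2, y2, x3, y3, x4, y4).
function quadBounds(q) {
    var xs = [q[0], q[2], q[4], q[6]];
    var ys = [q[1], q[3], q[5], q[7]];
    return [Math.min.apply(null, xs), Math.min.apply(null, ys),
            Math.max.apply(null, xs), Math.max.apply(null, ys)];
}

// True when the centre of any of the word's boxes falls inside a highlight quad.
function wordIsHighlighted(wordQuads, highlightQuads) {
    for (var w = 0; w < wordQuads.length; w++) {
        var wb = quadBounds(wordQuads[w]);
        var cx = (wb[0] + wb[2]) / 2;
        var cy = (wb[1] + wb[3]) / 2;
        for (var h = 0; h < highlightQuads.length; h++) {
            var hb = quadBounds(highlightQuads[h]);
            if (cx >= hb[0] && cx <= hb[2] && cy >= hb[1] && cy <= hb[3]) {
                return true;
            }
        }
    }
    return false;
}

// Collects the words of `page` covered by `highlightQuads`.
// `doc` needs getPageNumWords, getPageNthWord and getPageNthWordQuads,
// which an Acrobat Doc object provides.
function highlightedText(doc, page, highlightQuads) {
    var words = [];
    var n = doc.getPageNumWords(page);
    for (var i = 0; i < n; i++) {
        if (wordIsHighlighted(doc.getPageNthWordQuads(page, i), highlightQuads)) {
            words.push(doc.getPageNthWord(page, i, true)); // true: strip punctuation
        }
    }
    return words.join(" ");
}

// Possible use in Acrobat for one highlight annotation `annot` (not executed here):
// var text = highlightedText(this, annot.page, annot.quads);
```

Because punctuation is stripped and matching is per word, the result can differ slightly from what is visibly highlighted, as the note warns.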

    …and then you can execute this from the console:

    Code:
    // Delete every page except the first one, then save under a new name
    this.deletePages({nStart: 1, nEnd: this.numPages-1});
    this.saveAs(this.documentFileName.replace(/\.pdf$/i, " (Highlighted Texts in a Field).pdf"));
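
On the trustedFunction question raised above: `saveAs` needs a privileged context, which is why it works from the console but not from an ordinary menu script. A common pattern is to wrap the call in `app.trustedFunction` with `app.beginPriv` and `app.endPriv`, in a folder-level script. The sketch below illustrates that pattern only. The names `reportFileName` and `makeTrustedSave` are hypothetical, and the app object is passed in as a parameter so the code can be tried with a stand-in outside Acrobat. Whether it resolves the problem described in this post has not been verified.

```javascript
// Builds the output name used above:
// "name.pdf" -> "name (Highlighted Texts in a Field).pdf".
function reportFileName(fileName) {
    return fileName.replace(/\.pdf$/i, " (Highlighted Texts in a Field).pdf");
}

// Returns a function that saves `doc` to `devicePath` with raised privileges.
// `appObj` is Acrobat's `app` when this runs as a folder-level script.
function makeTrustedSave(appObj) {
    return appObj.trustedFunction(function (doc, devicePath) {
        appObj.beginPriv();
        try {
            doc.saveAs(devicePath);
        } finally {
            appObj.endPriv();   // always drop the raised privilege again
        }
    });
}

// Possible use in a folder-level script (not executed here):
// var trustedSave = makeTrustedSave(app);
// trustedSave(doc, doc.path.replace(/[^\/]+$/, reportFileName(doc.documentFileName)));
```

The trusted wrapper has to be created when the folder-level script loads; creating it later from a document-level script is not allowed.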

    @+
    :bonjour:

    #69596
    Rakesh Kumar
    Participant

    Sir,

    This isn’t working: when I run the script from the console, it deletes all the other pages and leaves only the first page of the document.

    #69597
    bebarth
    Keymaster

    Hi,
    Did you change the file extension from .txt to .js and place the file in Acrobat’s JavaScript folder?
    You also have to restart Acrobat, then run the “Copy Annotated Words – bebarth Version” menu item.
    After saving with the two-line script from the console, you should get a file like the one attached!
    @+
    :bonjour:

    #69598
    Rakesh Kumar
    Participant

    Yes, I did exactly that, and I do get the menu. But when I click the menu item, the highlighted text is shown in the console; and when I then copy your two-line script into the console and run it, it deletes all pages except the first page of the original file and saves a copy containing just the content of that first page.

    #69599
    bebarth
    Keymaster

    Do you have exactly the same menu, with “bebarth Annotation”?
    Did you remove the old file from the JS folder?
    @+
    :bonjour:

    #69600
    Merlin
    Keymaster

    Rakesh Kumar, why can’t you use the tool built into Acrobat Pro, “Create Comment Summary”?
    It looks like we are trying to reinvent the wheel.

    #69601
    bebarth
    Keymaster

    Merlin,
    Rakesh Kumar doesn’t want to recover the comments, but the highlighted text!
    @+
    :bonjour:

    #69602
    Rakesh Kumar
    Participant

    Sir,

    I want both the comments and the highlighted text to be extracted.
    @bebarth Sir, I did not remove the old JS file.

    Thanks

    #69603
    bebarth
    Keymaster

    Hi,
    So, is it working better?
    This script only extracts the highlighted text, not the comments!
    I’m still on vacation this week  :soleil: . I will have a look next week…
    @+
    :bonjour:

    #69604
    Merlin
    Keymaster

    “I want both comments , highlighted text  to be extracted.”

    So I repeat my question:

    Rakesh Kumar, why can’t you use the tool built into Acrobat Pro, “Create Comment Summary”?
    It looks like we are trying to reinvent the wheel.

    #70341
    bebarth
    Keymaster

    Hi,

    I’ve just finished a script that should meet this request.
    Please let me know if you or anybody else is interested in this script.

    @+
    😎
