I need help optimizing a batch file that extracts a few XML tags from more than a thousand XML files into .txt or .csv format.

The .xml files all share the same format. They are clinical studies and look like this:

<?xml version="1.0" encoding="UTF-8"?>
<clinical_study rank="373">
  <!-- This xml conforms to an XML Schema at:
    https://clinicaltrials.gov/ct2/html/images/info/public.xsd -->
  <required_header>
    <download_date>ClinicalTrials.gov processed this data on May 25, 2017</download_date>
    <link_text>Link to the current ClinicalTrials.gov record.</link_text>
    <url>https://clinicaltrials.gov/show/NCT00146471</url>
  </required_header>
  <id_info>
    <org_study_id>Kep-F10.3.01</org_study_id>
    <nct_id>NCT00146471</nct_id>
  </id_info>
  <brief_title>Efficacy and Safety of Levetiracetam in the Inpatient Treatment of Alcohol Withdrawal Syndrome</brief_title>
  <official_title>Efficacy and Safety of Levetiracetam in the Inpatient Treatment of Alcohol Withdrawal Syndrome [Sicherheit Und Wirksamkeit Von Levetiracetam (Keppra) für Die Behandlung Des stationären Alkoholentzugsyndroms]</official_title>
  <sponsors>
    <lead_sponsor>
      <agency>Charite University, Berlin, Germany</agency>
      <agency_class>Other</agency_class>
    </lead_sponsor>
  </sponsors>
  <source>Charite University, Berlin, Germany</source>
  <oversight_info>
    <has_dmc>Yes</has_dmc>
  </oversight_info>
  <brief_summary>
    <textblock>
      The purpose of this study is to evaluate the efficacy and safety of levetiracetam for
      treating alcohol withdrawal syndrome (AWS) in inpatients (vs. placebo). The primary come-out
      parameter is the reduction of the total needed amount of diazepam for add-on treatment of
      acute alcohol withdrawal symptoms. The secondary come-out parameter are - safety criteria
      (AE) - reduction of alcohol withdrawal score over the days.
    </textblock>
  </brief_summary>
  <overall_status>Completed</overall_status>
  <start_date>January 2006</start_date>
  <completion_date type="Actual">September 2007</completion_date>
  <primary_completion_date type="Actual">July 2007</primary_completion_date>
  <phase>Phase 3</phase>
  <study_type>Interventional</study_type>
  <has_expanded_access>No</has_expanded_access>
  <study_design_info>
    <allocation>Randomized</allocation>
    <intervention_model>Parallel Assignment</intervention_model>
    <primary_purpose>Treatment</primary_purpose>
    <masking>Double Blind (Participant, Care Provider, Investigator)</masking>
  </study_design_info>
  <primary_outcome>
    <measure>To evaluate the efficacy and safety of levetiracetam for treating alcohol withdrawal syndrome in inpatients. The primary come-out parameter is the reduction of the amount of diazepam for add-on treatment of acute alcohol withdrawal</measure>
    <time_frame>during trial</time_frame>
  </primary_outcome>
  <secondary_outcome>
    <measure>Secondary come-out parameters are - safety criteria (AE) - reduction of alcohol withdrawal score over the days</measure>
    <time_frame>during trial</time_frame>
  </secondary_outcome>
  <number_of_arms>2</number_of_arms>
  <enrollment type="Actual">120</enrollment>
  <condition>Alcohol Withdrawal Syndrome</condition>
  <arm_group>
    <arm_group_label>2</arm_group_label>
    <arm_group_type>Active Comparator</arm_group_type>
  </arm_group>
  <arm_group>
    <arm_group_label>1: Diazepam plus Placebo</arm_group_label>
    <arm_group_type>Placebo Comparator</arm_group_type>
  </arm_group>
  <intervention>
    <intervention_type>Drug</intervention_type>
    <intervention_name>Levetiracetam</intervention_name>
    <description>1500-2000 mg daily add-on or Placebo Diazepam as needed</description>
    <arm_group_label>2</arm_group_label>
    <other_name>KEPPRA</other_name>
  </intervention>
  <intervention>
    <intervention_type>Drug</intervention_type>
    <intervention_name>Placebo</intervention_name>
    <description>1500-2000 mg daily add-on or Placebo Diazepam as needed</description>
    <arm_group_label>1: Diazepam plus Placebo</arm_group_label>
  </intervention>
  <eligibility>
    <criteria>
      <textblock>
        Inclusion Criteria:

          -  Ages eligible for study: 18-65 years.

          -  Meets criteria for alcohol dependence according to DSM-IV/ICD-10

          -  Known withdrawal symptoms in the past in case of discontinuation of alcohol
             consumption

          -  Hospital admission for alcohol detoxification

          -  Able to provide a written informed consent.

          -  Able to follow verbal and written instructions (incl. a sufficient knowledge of
             German language).

          -  Must be medically acceptable for study treatment. No past or present physical
             disorder that is likely to deteriorate during participation. No ECG abnormality which
             would likely worsen during participation and no clinical laboratory abnormality that
             would also suggest deterioration during treatment.

          -  Have a negative urine drug screen for benzodiazepines or heroine or methadone

        Exclusion Criteria:

          -  Current diagnosis of any other substance dependence syndrome other than alcohol
             dependence (excluding nicotine and caffeine dependence).

          -  History of idiopathic epilepsy.

          -  Patient with any current clinically significant psychiatric disorder (acute
             suiciality) or developmental disorder (including organic mental disorder), like
             psychotic disorders.

          -  Patients with the following complications of alcoholism (lifetime): acute delirium
             tremens, hallucinatory alcoholic state, Korsakoff`s syndrome, Wernicke
             encephalopathy, decomposed liver cirrhosis (Child B, C), suspected cirrhosis with the
             following clinical symptoms detected at clinical exam: signs of portal hypertension
             and signs of hepato-cellular failure, thrombocytopenia.

          -  Subjects with known sensitivity of previous adverse reaction to levetiracetam

          -  Contra-indication (hypersensitivity to levetiracetam or pyrrolidone derivatives) or
             known non-response to levetiracetam.

          -  History of severe GI disease which might render absorption of the medication
             difficult or produce medical instability of the patient which would include active
             peptic ulcer disease, ulcerative colitis, regional colitis, or evidence by history or
             physical exam of GI bleeding.

          -  Patients with any clinically significant acute or chronic progressive neurological,
             gastrointestinal, cardiovascular, hepatic, renal, haematological, endocrine,
             dermatological or respiratory disease, such as diabetes, severe infection, acute
             alcoholic hepatitis, or any other medical condition with significant worsening of the
             clinical situation of the patient that might interfere with the evaluation of study
             medication.

          -  Female patients pregnant, breast-feeding or of child bearing age and not protected by
             effective contraceptive such as implants, injectables, combined oral contraceptives,
             some IUDS, sexual abstinence, sterilization or vasectomized partner.

          -  Actually continuous use of pharmacological agents that are known to lower the seizure
             threshold or augment or decrease the alcohol withdrawal syndrome.

          -  Subjects with known sensitivity of previous adverse reaction to diazepam or clonidine

          -  Contra-indication or known non-response to diazepam or clonidine
      </textblock>
    </criteria>
    <gender>All</gender>
    <minimum_age>18 Years</minimum_age>
    <maximum_age>65 Years</maximum_age>
    <healthy_volunteers>No</healthy_volunteers>
  </eligibility>
  <overall_official>
    <last_name>Martin Schaefer, MD</last_name>
    <role>Principal Investigator</role>
    <affiliation>Charité Campus Mitte, Klinik für Psychiatrie und Psychotherapie</affiliation>
  </overall_official>
  <location>
    <facility>
      <name>MLU Halle-Wittenberg</name>
      <address>
        <city>Halle</city>
        <state>Sachen/Anhalt</state>
        <zip>06097</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location>
    <facility>
      <name>Charité - Universitätsmedizin Berlin, Campus Charité Mitte, Klinik für Psychiatrie und Psychotherapie</name>
      <address>
        <city>Berlin</city>
        <zip>10117</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location>
    <facility>
      <name>Psychiatrische Klinik der Charité im St.-Hedwig Krankenhaus</name>
      <address>
        <city>Berlin</city>
        <zip>10559</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location>
    <facility>
      <name>Klinik für Psychiatrie und Suchtmedizin, Kliniken Essen Mitte</name>
      <address>
        <city>Essen</city>
        <zip>45136</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location>
    <facility>
      <name>Zentrum für Seelische Gesundheit</name>
      <address>
        <city>Rhede</city>
        <zip>46414</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location_countries>
    <country>Germany</country>
  </location_countries>
  <reference>
    <citation>Krebs M, Leopold K, Richter C, Kienast T, Hinzpeter A, Heinz A, Schaefer M. Levetiracetam for the treatment of alcohol withdrawal syndrome: an open-label pilot trial. J Clin Psychopharmacol. 2006 Jun;26(3):347-9.</citation>
    <PMID>16702910</PMID>
  </reference>
  <verification_date>September 2008</verification_date>
  <lastchanged_date>December 29, 2009</lastchanged_date>
  <firstreceived_date>September 6, 2005</firstreceived_date>
  <responsible_party>
    <name_title>Martin Schaefer, MD</name_title>
    <organization>Charite University, Berlin, Germany</organization>
  </responsible_party>
  <keyword>alcohol withdrawal</keyword>
  <keyword>detoxification</keyword>
  <keyword>Inpatients</keyword>
  <keyword>alcohol dependence according to DSM-IV/ICD-10</keyword>
  <keyword>withdrawal symptoms</keyword>
  <condition_browse>
    <!-- CAUTION:  The following MeSH terms are assigned with an imperfect algorithm  -->
    <mesh_term>Syndrome</mesh_term>
    <mesh_term>Substance Withdrawal Syndrome</mesh_term>
  </condition_browse>
  <intervention_browse>
    <!-- CAUTION:  The following MeSH terms are assigned with an imperfect algorithm  -->
    <mesh_term>Ethanol</mesh_term>
    <mesh_term>Diazepam</mesh_term>
    <mesh_term>Etiracetam</mesh_term>
    <mesh_term>Piracetam</mesh_term>
  </intervention_browse>
  <!-- Results have not yet been posted for this study                                -->
</clinical_study>

They all use the same tags, and I need a few of them, such as:

  • overall_official
  • lead_sponsor
  • official_title
  • results_reference
  • overall_status

So far I have tried the following code:

@echo off
setlocal enabledelayedexpansion
for %%a in (*.xml) do (
call :XMLExtract "%%a" "<results_reference>" location
echo.!location!,%%~na
)
exit /b

:XMLExtract file keystart location
@echo off & setlocal
for /f "tokens=3 delims=<>" %%a in ('Findstr /i /c:%2 "%~1"') do (
   set "loc=%%a" & goto :endloop
)
:endLoop
ENDLOCAL & IF "%~3" NEQ "" (SET %~3=%loc%) ELSE echo.%loc%
exit /b

I ran the batch file from the command prompt as: bat >> output.txt (or output.csv). It worked fine for overall_status, but every other tag has problems, for example:

  • overall_official: stops after about 10 of them
  • other tags: the file names are listed (as always), but without any information.

I would greatly appreciate any help on how to fix this, or another way to solve the task efficiently. I have only a small, basic understanding of programming, but I am confident I can work with any simple solution. The best help would be a way to adapt the batch code to do this. If any information is missing, I apologize and will provide it.

0
Lew Myschkin, May 28, 2017 at 01:44

2 answers

Best answer
@ECHO Off
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
:: SET "tags=overall_official lead_sponsor official_title results_reference overall_status"
SET "tags=%*"

FOR /f "tokens=1delims=" %%a IN (
 'dir /b /a-d "%sourcedir%\*.xml" '
 ) DO (
 REM Clear detected-tags flags for each file "%%a"
 FOR %%t IN (%tags% malformed) DO SET "%%t="
 REM remove "rem" from following line to delete any existing result file
 REM del "%destdir%\%%~na.txt" >nul 2>nul
 REM Read each line to %%L - usebackq to allow "quoted filenames"
 FOR /f "usebackqdelims=" %%L IN ("%sourcedir%\%%a") DO (
  REM remove leading spaces from %%L into %%P
  FOR /f "tokens=*" %%P IN ("%%L") DO (
   REM tokenise on "<>"
   FOR /f "tokens=1-3*delims=<>" %%w IN ("%%P") DO (
    IF "%%z" neq "" SET "malformed=%%z"
    FOR %%t IN (%tags%) DO IF "%%w"=="%%t" (SET "%%t=Y") else IF "%%w"=="/%%t" (SET "%%t=") 
    SET "report="
    FOR %%t IN (%tags%) DO IF DEFINED %%t SET "report=Y"
    REM (1 of 2) un-rem this to deposit in individual filenames
    REM (
    IF DEFINED report (
     REM we may have 1,2 or 3 tokens
     REM if 3, output token 2
     REM if 2, output token 1 if token 2 starts "/", token 2 otherwise
     REM if only 1, output entire line unless it is a target token
     IF "%%y" equ "" (
      IF "%%x" equ "" (
       REM only one token
       FOR %%t IN (%tags%) DO IF "%%w"=="%%t" (SET "report=") else IF "%%w"=="/%%t" (SET "report=") 
       IF DEFINED report ECHO %%L
      ) ELSE (
       REM two tokens
       ECHO %%x|FINDSTR /b "/">NUL 2>NUL
       IF ERRORLEVEL 1 (ECHO %%x) ELSE (ECHO %%w)
      )
     ) ELSE (ECHO %%x)
    )
    REM (2 of 2) un-rem this to deposit in individual filenames
    REM )>>"%destdir%\%%~na.txt"
    FOR %%t IN (%tags%) DO IF "%%y"=="/%%t" (SET "%%t=") 
    FOR %%t IN (%tags%) DO IF "%%x"=="/%%t" (SET "%%t=") 
   )
  REM pause
  )
 )
)

GOTO :EOF

You will need to change the sourcedir and destdir settings to suit your circumstances.

This may give you some ideas. You did not provide sample output, so you may want to prefix each output line with the file name (in %%~na) in the relevant echo statements.

Expected syntax to run:

thisbatchname tag tag tag tag

My approach is for %%a to contain the name of the file being processed, %%L the raw line data from the file, and %%P the raw line data with leading spaces removed.

Tokenising %%P using < and > as delimiters yields %%w to %%z, since each line contains 1-3 possible elements (tags or data). If there is a fourth, something is wrong. In that case the malformed flag is set for the file, although I have not done anything with it; it will contain the text where the problem lies (it could equally be set to %%P to capture the whole line).
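As a rough illustration of that tokenising step, the sketch below feeds three sample lines through the same `for /f` options. The `:split` label and the sample strings are assumptions made for this example and are not part of the script above.

```batch
@echo off
setlocal
rem Hypothetical sample lines; the real script reads them from each XML file.
call :split "<overall_status>Completed</overall_status>"
call :split "<lead_sponsor>"
call :split "</lead_sponsor>"
exit /b

:split
rem Split the argument on the angle brackets into up to three tokens plus a remainder.
for /f "tokens=1-3* delims=<>" %%w in ("%~1") do echo w=[%%w] x=[%%x] y=[%%y] z=[%%z]
exit /b
```

A line holding an opening tag, data and a closing tag should fill %%w, %%x and %%y in that order, while a line holding a single tag fills only %%w.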

So, using the required tags as variable names, simply set those variables to nothing or to something and use if defined to interpret their state. if defined works on the run-time status of a variable, so it follows the data as it changes line by line.

Note that since the entire operational part of the code is one giant code block, rem, not ::, should be used for remarks.

Also note that
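A minimal sketch of that flag technique follows. The words open, data and close are hypothetical stand-ins for an opening tag, a content line and a closing tag, and inside is an illustrative variable name.

```batch
@echo off
setlocal
set "inside="
rem open, data and close are stand-ins for a start tag, content and an end tag.
for %%w in (open data close data) do (
  rem Setting the flag to something defines it; setting it to nothing undefines it.
  if "%%w"=="open" set "inside=Y"
  if "%%w"=="close" set "inside="
  rem if defined is evaluated at run time, so it sees changes made inside this block.
  if defined inside (echo %%w: flag is set) else (echo %%w: flag is clear)
)
```

No delayed expansion is needed here, because the flag is only ever tested with if defined and never read back through %inside%.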

(
 commands
)>file

will redirect the output of commands according to the specified redirector (if required).
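For instance, a small sketch of a redirected block, assuming a throwaway file under %TEMP% (the file name is made up for this illustration):

```batch
@echo off
setlocal
rem Hypothetical output file, used only for this illustration.
set "outfile=%TEMP%\block_demo.txt"
(
  rem Remarks inside a block use rem
  echo first line
  echo second line
)>"%outfile%"
rem Show what the block wrote to the file.
type "%outfile%"
```

Using > recreates the file on every run; >> would append instead, as in the commented-out lines of the script above.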

0
Magoo, May 28, 2017 at 05:15

Try xpath.bat:

for /f "tokens=* delims=" %%# in ('xpath.bat "study.xml" "//reference/citation"') do set "reference_citation=%%#"
echo %reference_citation%

for /f "tokens=* delims=" %%# in ('xpath.bat "study.xml" "//official_title"') do set "official_title=%%#"
echo %official_title%

for /f "tokens=* delims=" %%# in ('xpath.bat "study.xml" "//lead_sponsor/agency"') do set "lead_sponsor=%%#"
echo %lead_sponsor%

for /f "tokens=* delims=" %%# in ('xpath.bat "study.xml"  "//overall_official"') do set  "overall_official=%%#"
echo %overall_official%
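
To cover every file rather than a single study.xml, the same call could be wrapped in a loop. This is only a sketch: it assumes xpath.bat is reachable from the current directory or PATH and prints the matched value as in the calls above, and output.csv is a hypothetical file name.

```batch
@echo off
setlocal enabledelayedexpansion
rem Assumption: xpath.bat takes a file name and an XPath expression, as shown above.
(
  for %%F in (*.xml) do (
    rem Clear the value so a file without the tag yields an empty field.
    set "status="
    for /f "tokens=* delims=" %%v in ('xpath.bat "%%F" "//overall_status"') do set "status=%%v"
    rem Quote both fields so that commas in a value do not break the CSV.
    echo "%%~nF","!status!"
  )
)>output.csv
```

With delayed expansion enabled, any exclamation marks in a file name or value would be lost, so this suits fields such as overall_status better than free text.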
0
npocmaka, Nov 10, 2018 at 22:30