Changes in PubCrawler so far:

For the most recent changes, please see the download page.

Changes for PubCrawler version 1.80 (25/11/02):

  1. changed link to home-page (now: http://pubcrawler.gen.tcd.ie)
  2. added 'tool' as allowed variable and option (defaults to PubCrawler_X.X), this also allows usage of 'holding' within tool string
  3. small fixes to HTML output (tables were a bit scrambled up after Entrez changed format slightly)
  4. added option 'mail_only' (expects number of hits as argument); this option skips redoing the queries and only sends the notification/e-mail (useful if mailing problems occurred)
  5. updated the JavaScript in the head of the simulated Entrez results page
  6. added explanatory text in case initialisation exceeds getmax value
  7. moved &save_db to a later point (after mail was sent)
  8. fix to keep entries for neighbourhood search (which still doesn't work properly)
  9. added link that allows retrieval of full reports
  10. adjusted POD to reflect GPL
Thanks to several users for their feedback, which helped to improve PubCrawler!

Changes for PubCrawler version 1.70 (29/11/00):

  1. added option to retry searches if a server error occurred (-retry <number of retries>)
  2. adjusted PubMed neighbourhood searches to changes at NCBI website
  3. disabled GenBank neighbourhood searches
  4. added logging of queries and mails (for internal use only)
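
The retry behaviour from item 1 can be pictured with the following minimal sketch. It is not taken from the PubCrawler source: the helper name search_with_retry, the simulated search, and the error text are all hypothetical, and the sketch assumes that a failed search is signalled with die.

```perl
use strict;
use warnings;

# Hypothetical helper (not from PubCrawler): run $search (a code
# reference) once, then up to $retries more times if it dies.
sub search_with_retry {
    my ($search, $retries) = @_;
    my $attempts = 0;
    while (1) {
        $attempts++;
        my $result;
        # eval returns 1 only if the search finished without dying
        my $ok = eval { $result = $search->(); 1 };
        return ($result, $attempts) if $ok;
        die "giving up after $attempts attempts: $@" if $attempts > $retries;
    }
}

# Simulated search: fails twice with a server error, then succeeds.
my $calls = 0;
my $flaky = sub {
    die "server error\n" if ++$calls < 3;
    return "3 hits";
};

my ($result, $attempts) = search_with_retry($flaky, 3);
print "$result after $attempts attempts\n";
```

With a retry count of 3 the search is attempted at most four times in total; a real implementation would probably also pause between attempts.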

Changes for PubCrawler version 1.60 (14/04/00):

  1. new option: mail_simple (mainly for WWW-Service)
    also simplifies ascii results
  2. hostname will be included as HTML comment in results file (UNIX systems only)
  3. fixed relentrezdate bug for years
  4. fixed undecoded ampersand in results links
  5. suppress connection test if option 'no_test' specified
    (by default PubCrawler tries direct connection first)
  6. added comments for internet connection test

Changes for PubCrawler version 1.55 (29/03/00):

  1. new command line options:
  2. more adjustments to PubMed's new format
  3. defaults for URLs
  4. handling of NCBI errors and connection failures
  5. additional path in environment setting
  6. higher splitsize (mailing of results only)

Changes for PubCrawler version 1.41 (29/02/00):

a lot of changes occurred, mostly due to adjustments to the new PubMed format
version number jumped from 0.992 to 1.41 because I hope it is approaching a stable version 2 (according to PubMed format '2'):
  1. new command line options:
  2. adjustments to new PubMed format:
  3. relentrezdate:
  4. specified big splitsize for metasend
  5. improved check routine
  6. net failure reported in notification
  7. $tool includes version number
  8. allow for multiple proxies, but test direct connection first
  9. deletion of temporary mail files
  10. splitting nickname from mail address at /#/ or /@@/ (second version works better for rsh invocation)
  11. sub replace header (for www-service use only)
  12. extended logging:

Changes for PubCrawler version 0.992 (14/10/99):

  1. added neighbourhood searches for PubMed and GenBank
  2. progress monitoring on results page during queries
  3. create backup of results before overwriting them
  4. changed homepage link to www.pubcrawler.ie
  5. improved check routine
  6. licensed PubCrawler under the GNU General Public License

Changes for PubCrawler version 0.99 (29/07/99):

  1. ordered variable declarations
  2. improved searching and checking for config file
  3. removed back to top link because it was faulty in many cases and the browser's back-button would do the job anyway.
  4. added some more messages for log file
  5. added decap-option to strip headers (only for experienced users)
  6. set number of hits in index to bold
  7. set "MORE" for today's linked documents to bold
  8. took out [all fields] from query (results in more hits)
  9. fixed bug with path for working directory in Windows

Changes for PubCrawler version 0.98 (08/06/99):

  1. changed PubCrawler e-mail address
  2. added option 'mail_ascii' to receive text-only mail
  3. fixed internal links containing double quotes
  4. prepended base URL to index links

Changes for PubCrawler version 0.97 (27/04/99):

  1. updated links for new Web-server
  2. included two new options:
    mail and notify
    These are mainly intended for PubCrawler's WWW-service and only work on Unix machines that have the program metasend installed.
  3. modified generation of base_URL
  4. keep count of overall hits in $total_hits
  5. keep copy of original query string
  6. removed printing of NCBI base href
  7. set chunksize of additional reports to 'extra_range'
  8. corrected value for 'dispmax' for additional links
  9. used gmtime instead of localtime

Changes for PubCrawler version 0.96 (26/03/99):

  1. Added variable extra_range
  2. Added variable base_URL
    Thanks to Amir Snapir for the suggestion!
  3. Added missing anchor for #TOP (back to top link)
  4. Replaced '+' with blank for query string appearing in output page
  5. Using option 'no_html' at first connection to NCBI
  6. Ignoring lines starting with '<' (HTML-commands) when appending config file to output
  7. Added missing check for variables 'fullmax' and 'search_URL' in check-subroutine
  8. Version number is automatically assigned from RCS variable

Changes for PubCrawler version 0.95 (25/02/99):

  1. Made 'years' and 'no limit' available as options for 'relpubdate'
  2. Command line option 'relpubdate' read in as string now
  3. Introduced variable $tool holding the name of the program requesting documents from NCBI
  4. more detailed output if 'No Documents Found'
  5. more detailed log
  6. HTML-tags are being removed from configuration file

Changes for PubCrawler version 0.94 (10/02/99):

  1. Two more functions added to sub BEGIN{}.
  2. Bug for use of alias in command line search corrected.
  3. Added line to end of PubCrawler help message
  4. Made output headline sensitive to single or multiple number of new results

Changes for PubCrawler version 0.93 (06/01/99):

  1. Warning issued if too many entries found.

Changes for PubCrawler version 0.92 (10/12/98):

  1. New option mute
    PubCrawler can be run in "mute-mode" with command line argument -mute or through the configuration file (mute 1). As a result, no more messages will be written to standard error (STDERR) unless an error is encountered. If verbose mode is switched off, so that all log-messages are written to a file, the program runs completely silently. This prevents mails from Cron (on Unix-systems) or open message-boxes (on MacOS) when started automatically.


  2. New format of link for additional reports
    If the number of received reports exceeded the value of fullmax, the additional documents were presented as links. These were put on one line for each fullmax results. Now they are all written in a single line.


  3. Results presented in order of search specification
    The output file will show the results in the order the searches were specified in the configuration file. The user therefore has control over the way the results are presented.


  4. Added break between NCBI-requests
    After each ENTREZ-search PubCrawler will sleep for 20 seconds before doing the next one.


  5. Improved reading of configuration file
    As an improvement to corrections for version 0.91 the stripping of white-space at the end of lines read in from the configuration file is now done by the following piece of code (approx. at line 1560):
    	...
            $line++;
            ($_) = split (/\#/);             # remove comments
            s/\s*$//;           # <--- new, clean end of line from white-space
    	...
    
    Thanks again to Danny Rice!
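
    As a self-contained illustration of the clean-up from item 5, the sketch below wraps the two statements from the excerpt in a small subroutine. The name clean_config_line and the sample values are hypothetical and do not come from PubCrawler; only the split and the substitution mirror the excerpt.

```perl
use strict;
use warnings;

# Hypothetical helper (not from PubCrawler): clean one configuration
# line using the same two steps as the excerpt above.
sub clean_config_line {
    my ($text) = @_;
    ($text) = split /\#/, $text;       # remove comments
    $text = '' unless defined $text;   # split returns nothing for '' or '#'
    $text =~ s/\s*$//;                 # clean end of line from white-space
    return $text;
}

# Sample lines; the values 30 and 20 are made up for illustration.
my @raw = (
    "viewdays 30   # days to look back\n",
    "# a line holding only a comment\n",
    "fullmax 20 \t\n",
);

# Brackets make any leftover white-space visible.
printf "[%s]\n", clean_config_line($_) for @raw;
```

    The guard against an undefined value is an addition for use under strict and warnings; the original excerpt works on $_ directly and does not need it.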




Bugs corrected in PubCrawler version 0.91 (08/12/98):

  1. Configuration file in current working directory was not read
    Inserted '&read_config;' in line 306 so that the whole passage reads:
    	...
    unless ($config_read) {
        # try the home-directory
        if ($ENV{'HOME'}) {
            $old_dir = $cwd;
            chdir $ENV{'HOME'};
            if (-r $config_file) {
                &read_config;
            } else {
                chdir $old_dir;
                &read_config;               #  <--- right here.
            }
        } elsif (-r $config_file) {
    	...
    
    Thanks to Danny Rice!


  2. Inaccessible links for extra documents
    Links for additional entries (higher than fullmax) were not accessible.
    Added the following lines at the end of subroutine read_config (approx. at line 1620):
    	...
        $^W = $warn_stat;      # set back warning status       
        close (CONFIG);
    
        $relpubdate =~ s/\s//g;        #  <--- add      
        $viewdays =~ s/\s//g;          #  <--- add  
        $fullmax =~ s/\s//g;           #  <--- add  
        $getmax =~ s/\s//g;            #  <--- add  
    
        $config_read = 1;
    }
    	...
    



Last modified at $Date: 2008/10/23 22:13:47 $