From Scribby Hamster, 7 Years ago, written in Plain Text.
    Running command: /opt/funnelback/linbin/java/bin/java
    With arguments: -cp /opt/funnelback/lib/java/all/*:/opt/funnelback/lib/java/groovy:/opt/funnelback/bin/funnelback-crawler.jar -server -Xms256m -Xmx640m -Dfile.encoding=UTF-8 com.funnelback.crawler.FunnelBack /opt/funnelback/VERSION/funnelback.lic /opt/funnelback/conf/LECC_web/collection.cfg
    Logging STDOUT into /opt/funnelback/data/LECC_web/offline/log/crawl.log STDERR into /opt/funnelback/data/LECC_web/offline/log/crawl.log
    Command will not read from STDIN
    Environment: {TEMP=/tmp/1481708223539-0, LD_LIBRARY_PATH=/opt/funnelback/lib/java, TMP=/tmp/1481708223539-0, SEARCH_HOME=/opt/funnelback, java.home=/opt/funnelback/linbin/java, TMPDIR=/tmp/1481708223539-0}
####################################################################################################

FunnelBack: Version: 15.6.0.0
JVM: Java HotSpot(TM) 64-Bit Server VM 25.25-b02 (Oracle Corporation)
Operating System: Linux 2.6.32-642.6.2.el6.x86_64 (amd64)
Encoding: UTF-8
FunnelBack: Started at: Wed Dec 14 20:37:04 EST 2016
FunnelBack: License verified.
FunnelBack: Overall Crawl Timeout: 86400000 (ms)
Funnelback: Using pre-crawl authentication.
FunnelBack: Processing forms based on: /opt/funnelback/conf/LECC_web/form_interaction.cfg

FunnelBack: Loaded cookie(s) from form, forcing use of HTTPClient library for cookie support.
FunnelBack: crawler.accept_cookies=true. Try setting to 'false' if authentication is not working.

FunnelBack: Configured 1 authentication cookies
HTTPClient Cookies:
SQ_SYSTEM_SESSION=fqgjrabl1i07s4kcau1s9k0itj8ng3hftif9buhudbrpgg23p23pq63fu8kuhbh3s7s6ahvgtc4alu3oe6fbet87o7uqp03p7c353l1; path=/; domain=lecc.clients.squiz.net
FunnelBack: Additional HTTP request header: [Cookie: SQ_SYSTEM_SESSION=fqgjrabl1i07s4kcau1s9k0itj8ng3hftif9buhudbrpgg23p23pq63fu8kuhbh3s7s6ahvgtc4alu3oe6fbet87o7uqp03p7c353l1]
Funnelback: Warning: crawler.packages.httplib set to 'HTTPClient', which may override any explicit Cookie: HTTP header field.
FunnelBack: File Store Limit: 5000
MultipleRequestsFrontier: Using specified internal frontier type for deferred request queue: com.funnelback.common.frontier.DiskFIFOFrontier
FunnelBack: Loaded: com.funnelback.crawler.NetCrawler
FunnelBack: Loaded: com.funnelback.common.frontier.MultipleRequestsFrontier:com.funnelback.common.frontier.DiskFIFOFrontier:1000
FunnelBack: Loaded: com.funnelback.crawler.scanner.RegExpHTMLScanner
FunnelBack: Loaded: com.funnelback.common.store.WarcStore
FunnelBack: Loaded: com.funnelback.crawler.StandardPolicy
FunnelBack: Loaded: com.funnelback.common.revisit.AlwaysRevisitPolicy
Cache: Table Initial Capacity: 10000
Cache: LRUCache Max Size: 500000
INFO: No portfolio information file: /opt/funnelback/conf/LECC_web/sites-by-portfolio.csv
INFO: No seed servers information file
CrawlStatistics: Loaded statistics classes.
FunnelBack: Loaded caches.
FunnelBack: Mime-types parsed [text/html,text/plain,text/xml,application/xhtml+xml,application/rss+xml,application/atom+xml,application/json,application/rdf+xml,application/xml]
FunnelBack: Protocols accepted [http,https]
FunnelBack: Robot agent matching [FunnelBack]
FunnelBack: Max Size In-Memory URL Buffer Cache: 10000
FunnelBack: Storing header information
Funnelback: Added 2 URLs to frontier.
FunnelBack: Control passed to coordinator.
Coordinator: Added 2 URLs to URL Cache from start_urls_file
Coordinator: Started 20 crawler thread(s) ...
Coordinator: Using overall timeout.
Monitor: Interval (secs): 30 Checkpoint Interval (secs): 1800
Monitor: Checking Config File: /opt/funnelback/conf/LECC_web/collection.cfg
Monitor: Printing statistics to monitor.log and crawl.log.1
HTTPClientTimedRequest: Trust Everyone
HTTPClientTimedRequest: Accept/send all cookies
Coordinator: Crawler 1 signalled completion.
Coordinator: Printing out final values to servers.log and domains.log
Coordinator: Final Checkpoint and Totals ...
DNSCache: Maximum cache size: 200000
Coordinator: Finished final checkpoint.
Timing_Avg: 2016:12:14:20:37:24 8141 1078 0 1 1 0
Timing_Totals_(mins): 2016:12:14:20:37:24 0 0 0 0 0 0 0
Timing: Crawler 1 Processed: 2 Stored: 2 Total Crawl Time (ms): 16781
Timing: Local URL Processing (ms): 16283 Calls: 2 Avg: 8141
Timing: Local HTTP GET (ms): 3234 Calls: 3 Avg: 1078
Timing: Local Binary Storage (ms): 0 Calls: 1 Avg: 0
Timing: Local Single Address Processing (ms): 91 Calls: 74 Avg: 1
Timing: Local Accept Address [incl. robots.txt] (ms): 80 Calls: 42 Avg: 1
Timing: Local Get Canonical Server (ms): 0 Calls: 1 Avg: 0
Date: Wed Dec 14 20:37:24 EST 2016
URLs Processed: 2
Duplicates: 0
HTTP Redirects: 0
HTTP Bad Responses: 0
Network (I/O) Errors: 0
Robot NoFollow URLs: 0
Threads Active: 1
Frontiers Active: 0
Bytes In (MB): 0
Bytes Out (MB): 0
Used Memory (MB): 115
Total Memory (MB): 309
Cache Size: 2
Frontier Size: 0
Total Data Stored (MB): 0
Total Web Servers: 1
Total URLs Downloaded: 2
Total URLs Stored: 2
Coordinator: Printing out crawl statistics to .stat files in log directory.
Coordinator: Attempting to deactivate crawler threads ...
Coordinator sleeping for 5 seconds before final shutdown ...
Coordinator: Closing URLStore.
Coordinator: Dumping frontier to log for analysis ...
Coordinator: Finished dumping frontier.
Coordinator: Finished at: Wed Dec 14 20:37:29 EST 2016
Coordinator: Finished crawl. Deactivating threads and exiting ...
Command finished with exit code: 0
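
For anyone reading the "Timing:" lines in the log above, here is a minimal Python sketch that pulls out the total, call count and average for each operation. The regular expression and the names TIMING and parse_timings are my own, not part of Funnelback. The values shown here are consistent with Avg being the total divided by the call count and rounded down, but that is an inference from this one log, not documented behaviour.

```python
import re

# Hypothetical pattern for the per-operation lines, e.g.
# "Timing: Local HTTP GET (ms): 3234 Calls: 3 Avg: 1078"
TIMING = re.compile(
    r"Timing: (?P<name>.+?) \(ms\): (?P<total>\d+) "
    r"Calls: (?P<calls>\d+) Avg: (?P<avg>\d+)"
)

def parse_timings(lines):
    """Return (name, total_ms, calls, avg_ms) for each matching log line."""
    rows = []
    for line in lines:
        m = TIMING.search(line)
        if m:
            rows.append(
                (m["name"], int(m["total"]), int(m["calls"]), int(m["avg"]))
            )
    return rows

# Lines copied from the log; the first one has no "Calls:" field and is skipped.
sample = [
    "Timing: Crawler 1 Processed: 2 Stored: 2 Total Crawl Time (ms): 16781",
    "Timing: Local URL Processing (ms): 16283 Calls: 2 Avg: 8141",
    "Timing: Local HTTP GET (ms): 3234 Calls: 3 Avg: 1078",
    "Timing: Local Binary Storage (ms): 0 Calls: 1 Avg: 0",
    "Timing: Local Single Address Processing (ms): 91 Calls: 74 Avg: 1",
    "Timing: Local Accept Address [incl. robots.txt] (ms): 80 Calls: 42 Avg: 1",
    "Timing: Local Get Canonical Server (ms): 0 Calls: 1 Avg: 0",
]

for name, total, calls, avg in parse_timings(sample):
    # Integer division reproduces the logged Avg for every line in this log.
    computed = total // calls if calls else 0
    print(f"{name}: total={total} ms, calls={calls}, avg={avg}, computed={computed}")
```

Nearly all of the 16781 ms crawl time is in URL processing (16283 ms over 2 URLs), while the HTTP GETs themselves account for 3234 ms.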