Environment: CentOS 5.7
mvn clean package -DskipTests
When building Hadoop-related projects with Maven, the build needs core artifacts that match the Hadoop version in use, and most of my clusters run CDH, so the corresponding packages have to be pulled from Cloudera's repository.
The fix is to add the Cloudera repository to pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <repositories>
    <repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
  </repositories>
</project>
The table below lists the project, groupId, artifactId, and version required to access each CDH4 artifact.
| Project | groupId | artifactId | version |
|---|---|---|---|
| Hadoop | org.apache.hadoop | hadoop-annotations | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-archives | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-assemblies | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-auth | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-client | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-common | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-datajoin | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-dist | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-distcp | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-extras | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-gridmix | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-hdfs | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-mapreduce-client-app | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-mapreduce-client-common | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-mapreduce-client-core | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-mapreduce-client-hs | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-mapreduce-client-jobclient | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-mapreduce-client-shuffle | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-mapreduce-examples | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-rumen | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-api | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-applications-distributedshell | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-applications-unmanaged-am-launcher | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-client | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-common | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-server-common | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-server-nodemanager | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-server-resourcemanager | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-server-tests | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-server-web-proxy | 2.0.0-cdh4.2.0 |
| | org.apache.hadoop | hadoop-yarn-site | 2.0.0-cdh4.2.0 |
| Hadoop MRv1 | org.apache.hadoop | hadoop-core | 2.0.0-mr1-cdh4.2.0 |
| | org.apache.hadoop | hadoop-examples | 2.0.0-mr1-cdh4.2.0 |
| | org.apache.hadoop | hadoop-minicluster | 2.0.0-mr1-cdh4.2.0 |
| | org.apache.hadoop | hadoop-streaming | 2.0.0-mr1-cdh4.2.0 |
| | org.apache.hadoop | hadoop-test | 2.0.0-mr1-cdh4.2.0 |
| | org.apache.hadoop | hadoop-tools | 2.0.0-mr1-cdh4.2.0 |
| Hive | org.apache.hive | hive-anttasks | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-builtins | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-cli | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-common | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-contrib | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-exec | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-hbase-handler | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-hwi | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-jdbc | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-metastore | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-pdk | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-serde | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-service | 0.10.0-cdh4.2.0 |
| | org.apache.hive | hive-shims | 0.10.0-cdh4.2.0 |
| HBase | org.apache.hbase | hbase | 0.94.2-cdh4.2.0 |
| ZooKeeper | org.apache.zookeeper | zookeeper | 3.4.5-cdh4.2.0 |
| Sqoop | org.apache.sqoop | sqoop | 1.4.2-cdh4.2.0 |
| Pig | org.apache.pig | pig | 0.10.0-cdh4.2.0 |
| | org.apache.pig | pigsmoke | 0.10.0-cdh4.2.0 |
| | org.apache.pig | pigunit | 0.10.0-cdh4.2.0 |
| Flume 1.x | org.apache.flume | flume-ng-configuration | 1.3.0-cdh4.2.0 |
| | org.apache.flume | flume-ng-core | 1.3.0-cdh4.2.0 |
| | org.apache.flume | flume-ng-embedded-agent | 1.3.0-cdh4.2.0 |
| | org.apache.flume | flume-ng-node | 1.3.0-cdh4.2.0 |
| | org.apache.flume | flume-ng-sdk | 1.3.0-cdh4.2.0 |
| | org.apache.flume | flume-ng-tests | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-channels | flume-file-channel | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-channels | flume-jdbc-channel | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-channels | flume-recoverable-memory-channel | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-clients | flume-ng-log4jappender | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-legacy-sources | flume-avro-source | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-legacy-sources | flume-thrift-source | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-sinks | flume-hdfs-sink | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-sinks | flume-irc-sink | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-sinks | flume-ng-elasticsearch-sink | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-sinks | flume-ng-hbase-sink | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-sources | flume-jms-source | 1.3.0-cdh4.2.0 |
| | org.apache.flume.flume-ng-sources | flume-scribe-source | 1.3.0-cdh4.2.0 |
| Oozie | org.apache.oozie | oozie-client | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-core | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-examples | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-hadoop | 2.0.0-cdh4.2.0.oozie-3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-hadoop-distcp | 2.0.0-mr1-cdh4.2.0.oozie-3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-hadoop-test | 2.0.0-mr1-cdh4.2.0.oozie-3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-hbase | 0.94.2-cdh4.2.0.oozie-3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-sharelib-distcp | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-sharelib-distcp-yarn | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-sharelib-hive | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-sharelib-oozie | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-sharelib-pig | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-sharelib-sqoop | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-sharelib-streaming | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-sharelib-streaming-yarn | 3.3.0-cdh4.2.0 |
| | org.apache.oozie | oozie-tools | 3.3.0-cdh4.2.0 |
| Mahout | org.apache.mahout | mahout-buildtools | 0.7-cdh4.2.0 |
| | org.apache.mahout | mahout-core | 0.7-cdh4.2.0 |
| | org.apache.mahout | mahout-examples | 0.7-cdh4.2.0 |
| | org.apache.mahout | mahout-integration | 0.7-cdh4.2.0 |
| | org.apache.mahout | mahout-math | 0.7-cdh4.2.0 |
| Whirr | org.apache.whirr | whirr-build-tools | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-cassandra | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-cdh | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-chef | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-cli | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-core | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-elasticsearch | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-examples | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-ganglia | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-hadoop | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-hama | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-hbase | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-mahout | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-pig | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-puppet | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-solr | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-yarn | 0.8.0-cdh4.2.0 |
| | org.apache.whirr | whirr-zookeeper | 0.8.0-cdh4.2.0 |
| DataFu | com.linkedin.datafu | datafu | 0.0.4-cdh4.2.0 |
| Sqoop2 | org.apache.sqoop | sqoop-client | 1.99.1-cdh4.2.0 |
| | org.apache.sqoop | sqoop-common | 1.99.1-cdh4.2.0 |
| | org.apache.sqoop | sqoop-core | 1.99.1-cdh4.2.0 |
| | org.apache.sqoop | sqoop-docs | 1.99.1-cdh4.2.0 |
| | org.apache.sqoop | sqoop-spi | 1.99.1-cdh4.2.0 |
| | org.apache.sqoop.connector | sqoop-connector-generic-jdbc | 1.99.1-cdh4.2.0 |
| | org.apache.sqoop.repository | sqoop-repository-derby | 1.99.1-cdh4.2.0 |
| HCatalog | org.apache.hcatalog | hcatalog-core | 0.4.0-cdh4.2.0 |
| | org.apache.hcatalog | hcatalog-pig-adapter | 0.4.0-cdh4.2.0 |
| | org.apache.hcatalog | hcatalog-server-extensions | 0.4.0-cdh4.2.0 |
| | org.apache.hcatalog | webhcat | 0.4.0-cdh4.2.0 |
| | org.apache.hcatalog | webhcat-java-client | 0.4.0-cdh4.2.0 |
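With the repository declared, any artifact in the table can be referenced as a normal Maven dependency. As a quick smoke test that the Cloudera repository resolves, you can fetch one artifact from the command line (a sketch using the stock maven-dependency-plugin; hadoop-client is just one example coordinate taken from the table above):
mvn dependency:get \
  -Dartifact=org.apache.hadoop:hadoop-client:2.0.0-cdh4.2.0 \
  -DremoteRepositories=cloudera::default::https://repository.cloudera.com/artifactory/cloudera-repos/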
Environment: CentOS 5.7, CDH 4.3, OraOop 1.6
Download OraOop from http://downloads.cloudera.com/connectors/oraoop-1.6.0-cdh4.tgz and unpack it:
[oracle@xxx ~]$ ls oraoop-1.6.0
bin conf docs install.sh version.txt
Set the environment variables (vi ~/.bash_profile):
export SQOOP_CONF_DIR=/etc/sqoop/conf
export SQOOP_HOME=/u01/cloudera/parcels/CDH/lib/sqoop
export HADOOP_CLIENT_OPTS="-Xmx2048m $HADOOP_CLIENT_OPTS"
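Reload the profile so the variables take effect in the current shell:
source ~/.bash_profile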
Then run the install script ./install.sh and test the result:
[oracle@xxx ~]$ sqoop list-tables --verbose --connect jdbc:oracle:thin:@xxx:8521:biprod --username xxx --password xxx
14/09/23 18:39:49 DEBUG tool.BaseSqoopTool: Enabled debug logging.
14/09/23 18:39:49 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/09/23 18:39:49 DEBUG util.ClassLoaderStack: Checking for existing class: com.quest.oraoop.OraOopManagerFactory
14/09/23 18:39:49 DEBUG util.ClassLoaderStack: Class is already available. Skipping jar /u01/cloudera/parcels/CDH/lib/sqoop/lib/oraoop-1.6.0.jar
14/09/23 18:39:49 DEBUG sqoop.ConnFactory: Added factory com.quest.oraoop.OraOopManagerFactory in jar /u01/cloudera/parcels/CDH/lib/sqoop/lib/oraoop-1.6.0.jar specified by /etc/sqoop/conf/managers.d/oraoop
14/09/23 18:39:49 DEBUG sqoop.ConnFactory: Loaded manager factory: com.quest.oraoop.OraOopManagerFactory
14/09/23 18:39:49 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
14/09/23 18:39:49 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.quest.oraoop.OraOopManagerFactory
14/09/23 18:39:49 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
14/09/23 18:39:49 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:oracle:thin:@xxx:8521
14/09/23 18:39:49 DEBUG manager.OracleManager$ConnCache: Instantiated new connection cache.
14/09/23 18:39:49 INFO manager.SqlManager: Using default fetchSize of 1000
14/09/23 18:39:49 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.OracleManager@52f6438d
14/09/23 18:39:49 DEBUG manager.OracleManager: Creating a new connection for jdbc:oracle:thin:@xxx:8521:biprod, using username: SQOOP_USER
14/09/23 18:39:49 DEBUG manager.OracleManager: No connection paramenters specified. Using regular API for making connection.
14/09/23 18:40:01 INFO manager.OracleManager: Time zone has been set to GMT
14/09/23 18:40:02 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@xxx:8521:biprod/SQOOP_USER
OS_ZHIXIN_CHG
T1
Environment: CentOS 6.2
h2o-sparkling is the marriage of H2O and Spark, aimed at machine learning: it lets you use H2O's machine-learning packages inside a Spark environment.
Install it as follows:
git clone https://github.com/0xdata/h2o-sparkling.git
cd h2o-sparkling
sbt assembly
Run the test:
[cloudera@localhost h2o-sparkling]$ sbt -mem 500 "run --local"
[info] Loading project definition from /home/cloudera/h2o-sparkling/project
[info] Set current project to h2o-sparkling-demo (in build file:/home/cloudera/h2o-sparkling/)
[info] Running water.sparkling.demo.SparklingDemo --local
03:41:11.030 main INFO WATER: ----- H2O started -----
03:41:11.046 main INFO WATER: Build git branch: (unknown)
03:41:11.047 main INFO WATER: Build git hash: (unknown)
03:41:11.047 main INFO WATER: Build git describe: (unknown)
03:41:11.047 main INFO WATER: Build project version: (unknown)
03:41:11.047 main INFO WATER: Built by: '(unknown)'
03:41:11.047 main INFO WATER: Built on: '(unknown)'
03:41:11.048 main INFO WATER: Java availableProcessors: 1
03:41:11.077 main INFO WATER: Java heap totalMemory: 3.87 gb
03:41:11.077 main INFO WATER: Java heap maxMemory: 3.87 gb
03:41:11.078 main INFO WATER: Java version: Java 1.6.0_31 (from Sun Microsystems Inc.)
03:41:11.078 main INFO WATER: OS version: Linux 2.6.32-220.23.1.el6.x86_64 (amd64)
03:41:11.381 main INFO WATER: Machine physical memory: 4.83 gb
03:41:11.393 main INFO WATER: ICE root: '/tmp/h2o-cloudera'
03:41:11.438 main INFO WATER: Possible IP Address: eth1 (eth1), 192.168.56.101
03:41:11.439 main INFO WATER: Possible IP Address: eth0 (eth0), 10.0.2.15
03:41:11.439 main INFO WATER: Possible IP Address: lo (lo), 127.0.0.1
03:41:11.669 main WARN WATER: Multiple local IPs detected:
+ /192.168.56.101 /10.0.2.15
+ Attempting to determine correct address...
+ Using /10.0.2.15
03:41:11.929 main INFO WATER: Internal communication uses port: 54322
+ Listening for HTTP and REST traffic on http://10.0.2.15:54321/
03:41:12.912 main INFO WATER: H2O cloud name: 'cloudera'
03:41:12.913 main INFO WATER: (v(unknown)) 'cloudera' on /10.0.2.15:54321, discovery address /230.63.2.255:58943
03:41:12.913 main INFO WATER: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
+ 1. Open a terminal and run 'ssh -L 55555:localhost:54321 cloudera@10.0.2.15'
+ 2. Point your browser to http://localhost:55555
03:41:12.954 main INFO WATER: Cloud of size 1 formed [/10.0.2.15:54321 (00:00:00.000)]
03:41:12.954 main INFO WATER: Log dir: '/tmp/h2o-cloudera/h2ologs'
prostate
03:41:20.369 main INFO WATER: Running demo with following configuration: DemoConf(prostate,true,RDDExtractor@file,true)
03:41:20.409 main INFO WATER: Demo configuration: DemoConf(prostate,true,RDDExtractor@file,true)
03:41:21.830 main INFO WATER: Data : data/prostate.csv
03:41:21.831 main INFO WATER: Table: prostate_table
03:41:21.831 main INFO WATER: Query: SELECT * FROM prostate_table WHERE capsule=1
03:41:21.831 main INFO WATER: Spark: LOCAL
03:41:21.901 main INFO WATER: Creating LOCAL Spark context.
03:41:34.616 main INFO WATER: RDD result has: 153 rows
03:41:34.752 main INFO WATER: Going to write RDD into /tmp/rdd_null_6.csv
03:41:36.099 FJ-0-1 INFO WATER: Parse result for rdd_data_6 (153 rows):
03:41:36.136 FJ-0-1 INFO WATER: C1: numeric min(6.000000) max(378.000000)
03:41:36.140 FJ-0-1 INFO WATER: C2: numeric min(1.000000) max(1.000000) constant
03:41:36.146 FJ-0-1 INFO WATER: C3: numeric min(47.000000) max(79.000000)
03:41:36.152 FJ-0-1 INFO WATER: C4: numeric min(0.000000) max(2.000000)
03:41:36.158 FJ-0-1 INFO WATER: C5: numeric min(1.000000) max(4.000000)
03:41:36.161 FJ-0-1 INFO WATER: C6: numeric min(1.000000) max(2.000000)
03:41:36.165 FJ-0-1 INFO WATER: C7: numeric min(1.400000) max(139.700000)
03:41:36.169 FJ-0-1 INFO WATER: C8: numeric min(0.000000) max(73.400000)
03:41:36.176 FJ-0-1 INFO WATER: C9: numeric min(5.000000) max(9.000000)
03:41:37.457 main INFO WATER: Extracted frame from Spark:
03:41:37.474 main INFO WATER: {id,capsule,age,race,dpros,dcaps,psa,vol,gleason}, 2.8 KB
+ Chunk starts: {0,83,}
+ Rows: 153
03:41:37.482 #ti-UDP-R INFO WATER: Orderly shutdown command from /10.0.2.15:54321
[success] Total time: 44 s, completed Aug 4, 2014 3:41:37 AM
Run against the local cluster:
[cloudera@localhost h2o-sparkling]$ sbt -mem 100 "run --remote"
[info] Loading project definition from /home/cloudera/h2o-sparkling/project
[info] Set current project to h2o-sparkling-demo (in build file:/home/cloudera/h2o-sparkling/)
[info] Running water.sparkling.demo.SparklingDemo --remote
03:25:42.306 main INFO WATER: ----- H2O started -----
03:25:42.309 main INFO WATER: Build git branch: (unknown)
03:25:42.309 main INFO WATER: Build git hash: (unknown)
03:25:42.309 main INFO WATER: Build git describe: (unknown)
03:25:42.309 main INFO WATER: Build project version: (unknown)
03:25:42.309 main INFO WATER: Built by: '(unknown)'
03:25:42.309 main INFO WATER: Built on: '(unknown)'
03:25:42.310 main INFO WATER: Java availableProcessors: 4
03:25:42.316 main INFO WATER: Java heap totalMemory: 3.83 gb
03:25:42.316 main INFO WATER: Java heap maxMemory: 3.83 gb
03:25:42.316 main INFO WATER: Java version: Java 1.6.0_31 (from Sun Microsystems Inc.)
03:25:42.317 main INFO WATER: OS version: Linux 2.6.32-220.23.1.el6.x86_64 (amd64)
03:25:42.383 main INFO WATER: Machine physical memory: 4.95 gb
03:25:42.384 main INFO WATER: ICE root: '/tmp/h2o-cloudera'
03:25:42.389 main INFO WATER: Possible IP Address: eth1 (eth1), 192.168.56.101
03:25:42.389 main INFO WATER: Possible IP Address: eth0 (eth0), 10.0.2.15
03:25:42.389 main INFO WATER: Possible IP Address: lo (lo), 127.0.0.1
03:25:42.587 main WARN WATER: Multiple local IPs detected:
+ /192.168.56.101 /10.0.2.15
+ Attempting to determine correct address...
+ Using /10.0.2.15
03:25:42.650 main INFO WATER: Internal communication uses port: 54322
+ Listening for HTTP and REST traffic on http://10.0.2.15:54321/
03:25:43.906 main INFO WATER: H2O cloud name: 'cloudera'
03:25:43.906 main INFO WATER: (v(unknown)) 'cloudera' on /10.0.2.15:54321, discovery address /230.63.2.255:58943
03:25:43.907 main INFO WATER: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
+ 1. Open a terminal and run 'ssh -L 55555:localhost:54321 cloudera@10.0.2.15'
+ 2. Point your browser to http://localhost:55555
03:25:43.920 main INFO WATER: Cloud of size 1 formed [/10.0.2.15:54321 (00:00:00.000)]
03:25:43.921 main INFO WATER: Log dir: '/tmp/h2o-cloudera/h2ologs'
prostate
03:25:46.985 main INFO WATER: Running demo with following configuration: DemoConf(prostate,false,RDDExtractor@file,true)
03:25:46.991 main INFO WATER: Demo configuration: DemoConf(prostate,false,RDDExtractor@file,true)
03:25:48.000 main INFO WATER: Data : data/prostate.csv
03:25:48.000 main INFO WATER: Table: prostate_table
03:25:48.000 main INFO WATER: Query: SELECT * FROM prostate_table WHERE capsule=1
03:25:48.001 main INFO WATER: Spark: REMOTE
03:25:48.024 main INFO WATER: Creating REMOTE (spark://localhost:7077) Spark context.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1.0:1 failed 4 times, most recent failure: TID 7 on host 192.168.56.101 failed for unknown reason
Driver stacktrace:
03:26:07.151 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
03:26:07.151 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
03:26:07.151 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
03:26:07.152 main INFO WATER: at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
03:26:07.152 main INFO WATER: at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
03:26:07.152 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
03:26:07.152 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
03:26:07.152 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
03:26:07.153 main INFO WATER: at scala.Option.foreach(Option.scala:236)
03:26:07.153 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
03:26:07.153 main INFO WATER: at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
03:26:07.153 main INFO WATER: at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
03:26:07.155 main INFO WATER: at akka.actor.ActorCell.invoke(ActorCell.scala:456)
03:26:07.155 main INFO WATER: at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
03:26:07.156 main INFO WATER: at akka.dispatch.Mailbox.run(Mailbox.scala:219)
03:26:07.156 main INFO WATER: at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
03:26:07.157 main INFO WATER: at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
03:26:07.158 main INFO WATER: at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
03:26:07.158 main INFO WATER: at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
03:26:07.162 main INFO WATER: at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
03:26:07.172 #ti-UDP-R INFO WATER: Orderly shutdown command from /10.0.2.15:54321
[success] Total time: 27 s, completed Aug 4, 2014 3:26:07 PM
2. Launch command:
sqoop export --connect jdbc:oracle:thin:@xxx:1521:biprod --username sqoop_user --password sqoop_user --table OS_ZHIXIN_CHG --export-dir /tmp/zhixin_chg/20140911/20140911charge.zhixin2.log
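Since Sqoop itself warns that --password on the command line is insecure (see the list-tables log above), the same export can prompt for the password instead; -P reads it interactively:
sqoop export --connect jdbc:oracle:thin:@xxx:1521:biprod --username sqoop_user -P --table OS_ZHIXIN_CHG --export-dir /tmp/zhixin_chg/20140911/20140911charge.zhixin2.log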
Environment: Ubuntu 12.04, PostgreSQL 9.1, WordPress 3.4.2
1. Install the environment:
sudo apt-get install apache2
sudo apt-get install postgresql-9.1
sudo apt-get install php5
sudo apt-get install php5-pgsql
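After the PHP packages are installed, restart Apache so the new modules are loaded (assuming the stock SysV service name on Ubuntu 12.04):
sudo service apache2 restart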
2. Download WordPress and the PostgreSQL plugin:
wget -O wordpress.tar.gz http://wordpress.org/latest.tar.gz
wget https://downloads.wordpress.org/plugin/postgresql-for-wordpress.1.3.1.zip
3. Unpack everything and move it under /var/www:
tar -xzf wordpress.tar.gz
unzip postgresql-for-wordpress.1.3.1.zip
sudo cp -R wordpress /var/www
sudo chown jerry:jerry /var/www/wordpress
sudo cp -R postgresql-for-wordpress/pg4wp /var/www/wordpress/wp-content/
cp /var/www/wordpress/wp-content/pg4wp/db.php /var/www/wordpress/wp-content
4. Change into /var/www/wordpress, copy wp-config-sample.php to wp-config.php, and edit it:
vi wp-config.php
Change these four settings to your PostgreSQL connection parameters:
define('DB_NAME', 'wordpress');
/** MySQL database username */
define('DB_USER', 'postgres');
/** MySQL database password */
define('DB_PASSWORD', 'xxxxxxx');
/** MySQL hostname */
define('DB_HOST', 'localhost:5432');
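The installer also expects the database named in DB_NAME to exist. A minimal sketch that creates it, assuming the default postgres superuser and local peer authentication:
sudo -u postgres createdb wordpress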
Now you have your own blog!
For reference, see:
https://github.com/txthinking/google-hosts
Environment: Windows 7, pidgin-2.10.9, send-screenshot-v0.8-3
Pidgin has no built-in support for sending screenshots, but a plugin can fill the gap. Download send-screenshot-v0.8-3.exe from https://code.google.com/p/pidgin-sendscreenshot/downloads/list and install it.
Then, in a conversation window, choose "Conversation" -> "More (o)" -> "Send Screenshot" to capture and send a screenshot.
Environment: Ubuntu 12.04
chmod u+w /etc/sudoers
vi /etc/sudoers
Add this line:
jerry ALL=(ALL:ALL) ALL
chmod u-w /etc/sudoers
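To verify the new rule, list jerry's sudo privileges (run as root, since -U queries another user's entry):
sudo -l -U jerry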
Environment: CentOS 5.7
1. Set up passwordless SSH login
Scenario: hosts A and B, where A needs to log in to B.
First, on host A:
[oracle@A ~]$ ssh-keygen -t rsa -P ''
[oracle@A ~]$ scp .ssh/id_rsa.pub oracle@B:/home/oracle
Then on host B:
[oracle@B ~]$ cat /home/oracle/id_rsa.pub >> ~/.ssh/authorized_keys
[oracle@B ~]$ chmod 600 ~/.ssh/authorized_keys
[oracle@B ~]$ chmod 700 ~/.ssh
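A quick check from host A; it should print B's hostname without asking for a password:
[oracle@A ~]$ ssh oracle@B hostname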
2. Create test_ssh.sh
The script:
#!/bin/sh
# Build the remote command string; $1 (a date such as 20140917) expands locally
# before the command is shipped to the remote host.
cmd="
cd /home/oracle
. ~/.bash_profile
ls
python load_zhixin.py $1
"
echo $cmd
ssh oracle@xx.xx.xx.xx "$cmd"
3. Run it: ./test_ssh.sh 20140917