
Just started. Work in progress!

I want to get started. Is there a Phoenix Hello World?

Pre-requisite: Download the latest Phoenix from here, copy phoenix-*.jar to the HBase lib folder, and restart HBase.

1. Using the console

  1. Start Sqlline: $ sqlline.sh [zookeeper]
  2. Execute the following statements when Sqlline connects:
create table test (mykey integer not null primary key, mycolumn varchar);
upsert into test values (1,'Hello');
upsert into test values (2,'World!');
select * from test;
  3. You should get the following output:
+-------+------------+
| MYKEY |  MYCOLUMN  |
+-------+------------+
| 1     | Hello      |
| 2     | World!     |
+-------+------------+

2. Using Java

Create a test.java file with the following content:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.PreparedStatement;
import java.sql.Statement;

public class test {

	public static void main(String[] args) throws SQLException {
		Statement stmt = null;
		ResultSet rset = null;

		// Connect to Phoenix; replace "zookeeper" with your ZooKeeper quorum host
		Connection con = DriverManager.getConnection("jdbc:phoenix:zookeeper");
		stmt = con.createStatement();

		// Create the table and upsert two rows; Phoenix buffers upserts
		// on the client and sends them to the server on commit
		stmt.executeUpdate("create table test (mykey integer not null primary key, mycolumn varchar)");
		stmt.executeUpdate("upsert into test values (1,'Hello')");
		stmt.executeUpdate("upsert into test values (2,'World!')");
		con.commit();

		// Query the rows back and print the value column
		PreparedStatement statement = con.prepareStatement("select * from test");
		rset = statement.executeQuery();
		while (rset.next()) {
			System.out.println(rset.getString("mycolumn"));
		}

		// Release resources
		rset.close();
		statement.close();
		stmt.close();
		con.close();
	}
}

Compile and execute on the command line:

$ javac test.java

$ java -cp "../phoenix-2.0.0-client.jar:." test

You should get the following output:

Hello
World!

Is there a way to bulk load in Phoenix?

Map Reduce

See the example here: https://github.com/arunsingh16/Map-Reduce-on-Phoenix-HBase.git (credit: Arun Singh).

CSV

CSV data can be bulk loaded with the built-in utility named psql. Typical upsert rates are 20K-50K rows per second (depending on how wide the rows are).

Usage example:

Create a table using psql: $ psql.sh [zookeeper] ../examples/web_stat.sql

Upsert CSV bulk data: $ psql.sh [zookeeper] ../examples/web_stat.csv

How do I create Views in Phoenix? What's the difference between Views and Tables?

Are there any tips for designing Phoenix schema?

  • Use salting to increase read/write performance. Salting can significantly increase read/write performance by pre-splitting the data into multiple regions (see the sketch below).
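
A minimal sketch, assuming your Phoenix version supports the SALT_BUCKETS table option; the bucket count controls how many regions the table is pre-split into:

-- Pre-split the table into 4 salted buckets (one per region)
create table test (mykey integer not null primary key, mycolumn varchar) salt_buckets = 4;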

How do I create an index on a column?
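
A minimal sketch, assuming a Phoenix version with secondary indexing support (the index name test_mycolumn_idx is arbitrary); the index is maintained as a separate table keyed on the indexed column:

-- Create a secondary index so filters on mycolumn avoid a full table scan
create index test_mycolumn_idx on test (mycolumn);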

How fast is Phoenix? Why is it so fast?

Phoenix is fast. A full table scan of 100M rows usually completes in 20 seconds (narrow table on a medium-sized cluster). This time comes down to a few milliseconds if the query contains a filter on key columns. For filters on non-key columns or non-leading key columns, you can add an index on those columns; this gives performance equivalent to filtering on a key column, because the index is a copy of the table with the indexed column(s) as part of its key.

Why is Phoenix fast even when doing a full scan?

  1. Phoenix chunks up your query using the region boundaries and runs the chunks in parallel on the client, using a configurable number of threads.
  2. Aggregation is done in a coprocessor on the server side, collapsing the amount of data returned to the client rather than returning it all (see the example below).
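
For example, an aggregate query over the test table from above returns only a single number to the client, because the counting happens server-side in each region's coprocessor (a sketch using the earlier schema):

-- Only per-region counts cross the network, not the rows themselves
select count(*) from test;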