query returns one row only, my code style is es5 #4

Open
jhsea3do opened this issue Apr 25, 2016 · 6 comments

jhsea3do commented Apr 25, 2016

Hi ufukomer,

Here is my ES5 code. It always returns only the first row of the results.
I think it may be caused by the 'pending' state, but how can I get all 50 rows once the pending state has finished?


> var sql = "select system_name from itm.system group by system_name";
> var client = require('node-impala').createClient({"host": "hadoop3"});
> client.query(sql, function(err, data){ console.log('err', err, 'data', data) })
{ state: 'pending' }
> err null data [ [ 'CASCECUP01:KUX' ],
[ { name: 'system_name', type: 'string', comment: '' } ] ]


> client.resultType = 'map'
'map'
> client.query(sql, function(err, data){ console.log('err', err, 'data', data) })
{ state: 'pending' }
> err null data Map { 'system_name' => [ 'CASCECUP01:KUX' ] }


> var client = require('node-impala').createClient({"host": "hadoop3", "resultType": 'map'})
undefined
> client.query(sql, function(err, data){ console.log('err', err, 'data', data) })
{ state: 'pending' }
> err null data Map { 'system_name' => [ 'CASCECUP01:KUX' ] }

Please compare with the same query under impala-shell:

[hadoop3:21000] > select system_name from itm.system group by system_name;
Query: select system_name from itm.system group by system_name
+----------------+
| system_name    |
+----------------+
| CASCECUP01:KUX |
| ASCECUP14:KUX  |
| ASCECSP02:KUX  |
| ASCECUP15:KUX  |
| ASCECMP01:KUX  |
| ASCECUP10:KUX  |
| ASCECUP09:KUX  |
| ESCECUP03:KUX  |
| DBCECUP02:KUX  |
| ASCECUP11:KUX  |
| ASCECUP07:KUX  |
| DBCECMP02:KUX  |
| ASCECUP08:KUX  |
| ASCECSP01:KUX  |
| FSCECUP01:KUX  |
| CASCECMP01:KUX |
| ASCECUP13:KUX  |
| ASCECMP02:KUX  |
| ASCECUP05:KUX  |
| ASCECUP20:KUX  |
| ASCECUP02:KUX  |
| ASCECUP03:KUX  |
| ASCECUP04:KUX  |
| ESCECUP01:KUX  |
| ESCECUP06:KUX  |
| MQCECUP01:KUX  |
| MQCECUP06:KUX  |
| CESCECUP02:KUX |
| ASCECUP18:KUX  |
| ASCECUP06:KUX  |
| ASCECUP01:KUX  |
| CASCECUP02:KUX |
| MQCECUP03:KUX  |
| MQCECUP04:KUX  |
| MQCECUP05:KUX  |
| CESCECUP01:KUX |
| CASCECUP04:KUX |
| ESCECUP02:KUX  |
| ESCECUP05:KUX  |
| ASCECUP16:KUX  |
| CASCECUP03:KUX |
| DBCECUP01:KUX  |
| ASCECUP12:KUX  |
| DBCECMP01:KUX  |
| ASCECUP17:KUX  |
| CFSCECUP01:KUX |
| FSCECUP02:KUX  |
| ESCECUP04:KUX  |
| MQCECUP02:KUX  |
| ASCECUP19:KUX  |
+----------------+
Fetched 50 row(s) in 0.78s
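
On the 'pending' question: in the REPL transcript, `{ state: 'pending' }` is printed before the callback output, which is consistent with the REPL simply echoing a promise-like object returned by `query`. The sketch below illustrates that pattern with a hypothetical stand-in (`fakeQuery` is not part of node-impala, and its rows are made up). It assumes the client both returns a promise and invokes a Node-style callback. If that reading is right, the pending state by itself would not explain missing rows, because the callback only runs once the result has settled.

```javascript
// Hypothetical stand-in for a client whose query() both returns a promise
// and calls a Node-style callback. This is NOT node-impala's implementation.
function fakeQuery(sql, callback) {
  var rows = [['row-1'], ['row-2'], ['row-3']]; // made-up result set
  var promise = new Promise(function (resolve) {
    // Resolve on a later tick, the way a network round trip would.
    setTimeout(function () { resolve(rows); }, 10);
  });
  // Deliver the settled result to the callback as well.
  promise.then(function (result) { callback(null, result); });
  return promise; // a REPL echoes this return value immediately
}

var returned = fakeQuery('select 1', function (err, data) {
  // Runs only after the promise has settled, with the complete result.
  console.log('err', err, 'rows received:', data.length);
});

// This line runs first, while the promise is still pending.
console.log('returned a promise:', returned instanceof Promise);
```

In this sketch the callback receives every row the stand-in produced, so a truncated result would have to come from somewhere other than the pending state.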

Owner

ufukomer commented Apr 26, 2016

@jhsea3do As I understand it, this is a major problem with the Beeswax service. Each INSERT creates a new data file in HDFS. Unfortunately, Beeswax sometimes reads only one of those files and sometimes all of them. I inserted two sample rows into the sample_08 table, which means two separate data files:

[two images, both captioned "sample_08"]

But Beeswax reads only one of them:

// node-impala: output of query (SELECT * FROM sample_08)
[ { code: '10-0000',
    description: 'Yow',
    total_emp: '1112',
    salary: '2000' } ]

So the issue never occurs as long as all the data is kept in one data file, but of course that is not a real solution. It seems to me that the only suitable fix is to use HiveServer2 rather than Beeswax. That is not straightforward, though, since I would need to implement a SASL transport similar to the ones in its Java and Python clients.

I would be glad to hear alternative solutions if you have an idea.

@metalsquilla

Hi! Do you have other suggestions for using Impala with Node.js?

Owner

ufukomer commented Nov 3, 2016

@tiejian You could create a command-line app and run impala-shell through it. I have never tried this in Node.js, but there are many libraries for it on GitHub, e.g. commander.js and cli. I'm not sure whether these tools meet your needs, so please do your own research.

If your Impala host is remote, you would probably need a socket (e.g. using Socket.IO) to connect the command-line app running on your local machine to a command-line app running on the remote host. That way you could probably still use impala-shell.

If my explanation wasn't clear, please don't hesitate to ask for details.
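
A minimal sketch of the impala-shell approach described above, using Node's built-in `child_process.execFile`. It assumes `impala-shell` is installed and on the `PATH` of the machine running Node, and that result rows arrive on stdout as tab-delimited lines when `-B` is passed. The helper names (`buildArgs`, `parseRows`, `queryViaShell`) are hypothetical, and the parsing is deliberately naive (no handling of tabs or newlines inside values).

```javascript
var execFile = require('child_process').execFile;

// Build impala-shell arguments: -i selects the impalad host, -B asks for
// delimited (tab-separated by default) output, -q passes the query text.
function buildArgs(host, sql) {
  return ['-i', host, '-B', '-q', sql];
}

// Split delimited stdout into rows, each row an array of column strings.
function parseRows(stdout) {
  return stdout.split('\n')
    .filter(function (line) { return line.length > 0; })
    .map(function (line) { return line.split('\t'); });
}

// Hypothetical wrapper: run impala-shell and pass all parsed rows
// to a Node-style callback.
function queryViaShell(host, sql, callback) {
  execFile('impala-shell', buildArgs(host, sql), function (err, stdout) {
    if (err) { return callback(err); }
    callback(null, parseRows(stdout));
  });
}
```

It could then be called as `queryViaShell('hadoop3', sql, function (err, rows) { ... })`. Spawning a process per query is slow compared with a native client, so this is a stopgap rather than a replacement for the library.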


kwent commented Feb 5, 2018

Hi,

I'm hitting this issue as well. It is a major problem and makes the entire library unusable...

What can we do to help fix it?

Regards

Owner

ufukomer commented Feb 5, 2018

@kwent We should make this library use HiveServer2, similar to how its Python client does, as I mentioned in my comment above.


sunui commented Sep 9, 2019

Hello everybody, I ran into this problem recently, and I may have found a way to work around it! I am not sure, though, so I need you to verify it, and then let's talk about the reason.
Like you, when I query SELECT * FROM atable limit 10 I get 10 rows:

[
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"b","b":2},
{"a":"b","b":2},
{"a":"b","b":2},
{"a":"b","b":2},
]

But when I query SELECT a,count(*) FROM atable group BY a I get only 1 row! This is where the problem lies!

[
  {
    "a": "a",
    "count(*)": "6"
  }
]

After some research I found a way to get the expected result!
I query SELECT a,count(*) FROM atable group BY a order by a. The order by is the key.

[
  {
    "a": "a",
    "count(*)": "6"
  },
  {
    "a": "b",
    "count(*)": "4"
  }
]

Try it! I'm looking forward to your feedback!
@ufukomer
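
For anyone who wants to try the ORDER BY workaround without editing every query by hand, here is a small sketch. `withOrderBy` is a hypothetical helper, not part of node-impala. It uses a naive text check rather than a SQL parser, so it would be fooled by an "order by" inside a subquery or string literal. Whether the workaround holds in general is unverified in this thread.

```javascript
// Hypothetical helper: append "ORDER BY <column>" to a query unless the
// text already contains an ORDER BY clause. Naive check, not a SQL parser.
function withOrderBy(sql, column) {
  // Drop surrounding whitespace and any trailing semicolon first.
  var trimmed = sql.trim().replace(/;\s*$/, '');
  if (/\border\s+by\b/i.test(trimmed)) {
    return trimmed; // already ordered, leave the query as it is
  }
  return trimmed + ' ORDER BY ' + column;
}

console.log(withOrderBy('SELECT a,count(*) FROM atable group BY a', 'a'));
// prints: SELECT a,count(*) FROM atable group BY a ORDER BY a
```

The result can be passed to `client.query` in place of the original SQL string.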
