I am a bit desperate about this problem... I have no idea how to face it.
Here is a simpler way to look at this problem:
If my insert cql query is:
"BEGIN BATCH USING CONSISTENCY ONE insert into my_table(id,'2014-04-11 8:00:00',...,'2014-04-15 10:00:00') values ('2036548',3.15,...,4.11) APPLY BATCH"
...and my data request cql query is:
"Select FIRST 100000 '2014-04-01 0:00:00'..'2014-04-16 0:00:00' from my_table where id=2036548"
...why does the inserted date 2014-04-15 10:00:00 changes to 2014-04-15 11:00:00 when pullling it from Cassandra?
The date pulling code in vb.net is:
Public Shared Function getCassandraDate(ByVal value As Byte()) As Date
Dim buffer As Byte() = New Byte(value.Length - 1) {}
value.CopyTo(buffer, 0)
Array.Reverse(buffer)
Dim ticks As Long = BitConverter.ToInt64(buffer, 0)
Dim dateTime As New System.DateTime(1970, 1, 1, 0, 0, 0, _
0)
dateTime = dateTime.AddMilliseconds(ticks)
Return dateTime.ToLocalTime
End Function
...same thing in PHP:
date_default_timezone_set("Europe/Paris");
$time = $this->unpackDate($packed_time);
$str_time = date('Y-m-d H:i:s',$time); //TODO : to local time
private function unpackDate($data, $is_name=null)
{
$arr = unpack('N2', $data);
// If we are on a 32bit architecture we have to explicitly deal with
// 64-bit twos-complement arithmetic since PHP wants to treat all ints
// as signed and any int over 2^31 - 1 as a float
if (PHP_INT_SIZE == 4) {
$hi = $arr[1];
$lo = $arr[2];
$isNeg = $hi < 0;
// Check for a negative
if ($isNeg) {
$hi = ~$hi & (int)0xffffffff;
$lo = ~$lo & (int)0xffffffff;
if ($lo == (int)0xffffffff) {
$hi++;
$lo = 0;
} else {
$lo++;
}
}
// Force 32bit words in excess of 2G to pe positive - we deal wigh sign
// explicitly below
if ($hi & (int)0x80000000) {
$hi &= (int)0x7fffffff;
$hi += 0x80000000;
}
if ($lo & (int)0x80000000) {
$lo &= (int)0x7fffffff;
$lo += 0x80000000;
}
$value = $hi * 4294967296 + $lo;
if ($isNeg)
$value = 0 - $value;
} else {
// Upcast negatives in LSB bit
if ($arr[2] & 0x80000000)
$arr[2] = $arr[2] & 0xffffffff;
// Check for a negative
if ($arr[1] & 0x80000000) {
$arr[1] = $arr[1] & 0xffffffff;
$arr[1] = $arr[1] ^ 0xffffffff;
$arr[2] = $arr[2] ^ 0xffffffff;
$value = 0 - $arr[1]*4294967296 - $arr[2] - 1;
} else {
$value = $arr[1]*4294967296 + $arr[2];
}
}
return $value / 1e3;
}
MORE DETAILS
Processing chain:
(1). insertion to Cassandra through .NET
(2). Cassandra data storage
(3). Pulling the data from PHP or .NET
Problem:
As for today, a date being 2014-04-15 10:00:00 in step (1), will come out as 2014-04-15 11:00:00 in step (3).
Details:
(regarding the date format in this chain)
(1). Local time in .NET (Timezone: "Europe/Paris"). Insertion cql that is being executed: "BEGIN BATCH USING CONSISTENCY ONE insert into my_table(id,'2014-04-11 8:00:00',...,'2014-04-15 10:00:00') values ('2036548',3.15,...,4.11) APPLY BATCH"
(2). ??? I don't know what Cassandra does here... ???
(3). Example of cql query to pull the data: "Select FIRST 100000 '2014-04-01 0:00:00'..'2014-04-16 0:00:00' from my_table where id=2036548"
. In php: date_default_timezone_set("Europe/Paris"); $str_time = date('Y-m-d H:i:s',$time);
. In .NET: dateTime.ToLocalTime
.
Extra info:
I think it worked well before the daylight saving time change some weeks ago. But I can not be sure about that.
If in step (1), if I changed the date to de date to UTC before inserting it, 2014-04-15 10:00:00 will become 2014-04-15 08:00:00 and the output will be 2014-04-15 09:00:00, which is still not correct.
I highly suspect that the trick here is between steps (1) and (2), that is to say, me not being able to understand how Cassandra treats dates.
Edit1:
@Ananth 's questions:
both cassandra and client run in the same datacenter?
It is complicated:
Can you post your schema here?
Here it is
CREATE TABLE tsmeasures (
id int PRIMARY KEY
) WITH
comment='' AND
comparator=timestamp AND
read_repair_chance=0.100000 AND
gc_grace_seconds=0 AND
default_validation=double AND
min_compaction_threshold=4 AND
max_compaction_threshold=32 AND
replicate_on_write='true' AND
compaction_strategy_class='SizeTieredCompactionStrategy' AND
compression_parameters:sstable_compression='SnappyCompressor';
Edit2:
After testing it step by step, this is the result:
Ticks converted (through http://www.epochconverter.com/ )
These results makes no sense to me...
More details:
cql insert:
"BEGIN BATCH USING CONSISTENCY ONE insert into tsmeasures(id,'2014-04-11 15:00:00',...,'2014-04-15 15:00:00') values ('2036548',0,...,4.85) APPLY BATCH"
cql fetch:
"SELECT '2014-04-10 16:00:00'..'2014-04-20 17:00:00' FROM tsmeasures WHERE id IN (2036548,2036479,2036174,650877)"
Thus '2014-04-15 15:00:00' is included in the range of the fetch, and I can identify it because it is the highest value.
I will keep digging...
Before Edit
Is there a clock time sync problem between your client and cassandra? I would strictly recommend running NTP between your client and cassandra installation.
Post Edit
CREATE TABLE tsmeasures (
id int PRIMARY KEY
) WITH
comment='' AND
comparator=timestamp AND
read_repair_chance=0.100000 AND
gc_grace_seconds=0 AND
default_validation=double AND
min_compaction_threshold=4 AND
max_compaction_threshold=32 AND
replicate_on_write='true' AND
compaction_strategy_class='SizeTieredCompactionStrategy' AND
compression_parameters:sstable_compression='SnappyCompressor';
From what you have given , it looks like you are trying to get the insertion time .
Your problem might be due to clients running in different clock cycles with respect to cassandra. Cassandra just places a unix timestamp for each write.
So what is happening here from what i see.
You write from client using timestamp X(datastax driver sets this insertion timestamp). Cassandra writes with X.
You read with a timestamp Y. Cassandra tries to read with timestamp Y(So , as per your explanation, PHP client is there in a different location).
Both a are bound to differ.
Solution 1
Try to have a global NTP between the entire set up so that client clock cycles are in sync with cassandra.
Solution 2
Insert a column named timestamp which is user driven and do a range scan based on that
Solution 3
Set the insertion time in DML operations.
This seems to be a time zone issue. It appears you are neither specifying a timezone when storing nor when retrieving the timestamps. According to the documentation Cassandra applies the timezone of the coordinator node handling the write request if no timezone is supplied by the client. If timestamps shift between writing and reading them, that probably means all or some of your Cassandra nodes are not configured for the same timezone as your client is.