4

This doubt aroused when I am answering one of the question that posted here. But this question is not a continuation of that question, rather a general doubt.

Basically I need to get log-customer collection. I can't get it directly, since there is no collection defined for log customers. ie

Mage::getModel('log/customer')->getCollection(); //false

However I can get the above collection indirect way. I need to use log-visitor collection for this and then using this collection I can get log-customer collection. This is the code that do that trick.

$collection = Mage::getModel('log/visitor')->getCollection()->addFieldToFilter('store_id', array('eq' => 1));
$collection->getSelect()
            ->join( 
                array('log_customer'=> $collection->getTable('log/customer')),
                'main_table.`visitor_id`= log_customer.`visitor_id`'
            );
foreach ($collection->getItems() as  $item) {
    print_r($item->getData());
}

This code will do the trick if everything came right. But there is a small problem related to this code. In order to show what is the problem, let me show you log_customer table.

+--------+------------+-------------+---------------------+---------------------+----------+
| log_id | `visitor_id` | customer_id | login_at            | logout_at           | store_id |
+--------+------------+-------------+---------------------+---------------------+----------+
|      1 |        105 |           2 | 2014-02-05 18:49:10 | 2014-02-05 13:19:10 |        4 |
|      2 |        105 |           2 | 2014-02-05 13:19:30 | NULL                |        4 |
|      3 |        113 |           3 | 2014-08-05 10:26:08 | NULL                |        1 |
|      4 |        325 |           3 | 2014-09-04 01:32:18 | NULL                |        1 |
+--------+------------+-------------+---------------------+---------------------+----------+

Here you can see that, there are multiple entries for visitor_id = 105. That duplicate entry came because, customer didn't logout properly. (See null value of second duplicate entry). Hence if I filter log-customer collection for store_id = 4. It will throw an error, which seems like this.(the code will work as expected store_id = 1, since there is no duplicate entries)

( ! )Fatal error: Uncaught exception 'Exception' with message 'Item (Mage_Log_Model_Visitor) with the same id "105" already exist' in C:\wamp\www\magento_v1.8\lib\Varien\Data\Collection.php on line 373

( ! ) Exception: Item (Mage_Log_Model_Visitor) with the same id "105" already exist in C:\wamp\www\magento_v1.8\lib\Varien\Data\Collection.php on line 373

So in general, I need to avoid this. So how can I distinct my join query ? Please enlighten me with your beautiful thoughts.

Thanks

Rajeev K Tomy
  • 17,234
  • 6
  • 61
  • 103

4 Answers4

6

What you could consider doing is adding a group by clause for the visitor_id column.

$collection = Mage::getModel('log/visitor')->getCollection()->addFieldToFilter('store_id', array('eq' => 1));
$collection->getSelect()
    ->join(
        array('log_customer'=> $collection->getTable('log/customer')),
        'main_table.`visitor_id`= log_customer.`visitor_id`'
    )->group('main_table.visitor_id');

But you might also need to add an order by to make sure that you get the newest results first.

David Manners
  • 27,241
  • 9
  • 76
  • 220
  • Can you add any example how can I get newest results first? Tried ->order('log_customer.visitor_id'), but this doesn't help. – Adarsh Khatri Aug 25 '16 at 04:26
4
Zend_Db_Select::distinct()

The above method should help you. You could just call it by:

$collection->getSelect()->distinct();
Paras Sood
  • 2,540
  • 1
  • 14
  • 23
  • I have tried getCollection->distinct(true); didn't work. I will try your code now – Rajeev K Tomy Sep 05 '14 at 15:40
  • where should I put join() ? before or after distinct() ? – Rajeev K Tomy Sep 05 '14 at 15:41
  • 2
    The distinct method is present in the Zend_Db_Select class and not in the collection class heirarchy. Calling getSelect() on the collection returns you the object to Varien_Db_Select class that inherits from the zend class. – Paras Sood Sep 05 '14 at 15:42
  • You can put distinct wherever. The query is formed and executed only at the time collection->load() is called.The order shouldn't matter. – Paras Sood Sep 05 '14 at 15:42
  • unfortunately it didn't work. I have tried distinct(), distinct(true). – Rajeev K Tomy Sep 05 '14 at 15:47
  • 1
    After you are adding distinct and just before you load the collection do this:

    print_r((string)$collection->getSelect())

    This will show you the select query being fired. Check that.

    – Paras Sood Sep 05 '14 at 15:49
  • yep but please note I am not loading collection. I am doing it an indirect way. That is using getItems() :) – Rajeev K Tomy Sep 05 '14 at 15:53
  • Yeah get items would do a load as well. Just log it before you call getItems. – Paras Sood Sep 05 '14 at 15:59
  • SELECT DISTINCT `main_table`.*, `log_customer`.* FROM `log_visitor` AS `main_table` INNER JOIN `log_customer` ON main_table.`visitor_id`= log_customer.`visitor_id` WHERE (main_table.store_id = 1) this is the query that generating. As I guess, distinct is applying for log_visitor. I think it will work if distinct get applied to log_customer. But unfortunately it is not possible, since I cant use log_customer collection instead of log_visitor link. Please refer the link that I provided in the question – Rajeev K Tomy Sep 05 '14 at 16:06
  • +1 for the answer, since in normal case it should work. Hower my problem still persists. So I can't accept the answer right now. :) – Rajeev K Tomy Sep 05 '14 at 16:13
  • Add attribute to select as only visitor_id for main table. So the select becomes: Select Distinct main_table.visitor_id...try it... – Paras Sood Sep 05 '14 at 16:30
2

Distinct and Group, as described by David and Paras, will work on 9 out of 10 queries. If it doesn't, make sure that you aren't reusing the column name of the primary column of the collection.

The following does not work all the time because catalog_product_entity has a column named entity_id which is the primary column of the order collection

$order_collection = Mage::getModel('sales/order')
        ->getCollection();

    $product_collection = Mage::getModel('catalog/product')
        ->getCollection()
        ->addFieldToFilter('attribute_set_id', $attributeSetId)
    ;

    $order_collection->getSelect()
        ->join($sales_order_item_table,
            "$sales_order_item_table.order_id = main_table.entity_id")
        ->join(new Zend_Db_Expr('('.$product_collection->getSelect().')'),
            "t.entity_id = $sales_order_item_table.product_id")
        ->group("main_table.entity_id")
    ;

To fix it, just add an empty array as the third parameter of the join

        $order_collection->getSelect()
        ->join($sales_order_item_table,
            "$sales_order_item_table.order_id = main_table.entity_id",
            array())
        ->join(new Zend_Db_Expr('('.$product_collection->getSelect().')'),
            "t.entity_id = $sales_order_item_table.product_id",
            array())
        ->group("main_table.entity_id")
    ;
james.c.funk
  • 131
  • 3
0

While adding the join make sure blank array is passed in the third parameter.

$this->getSelect()->joinLeft(
    'sales_order_item',
    'sales_order_item.order_id = main_table.order_id',
    []
);

After that add following to apply DISTINCT

$this->getSelect()->distinct(true);
Wolfack
  • 291
  • 2
  • 9