I am working with Entity Framework. I am retrieving data using a query with .Where and .Select conditions.

var myData = await _dbContext.Samples.Include(i => i.Experiment)
                                     .Include(i => i.Experiment.Test)
                                     .Include(i => i.Experiment.Test.Project)
                                     .Include(i => i.Samples)
                                     .Where(i => i.Experiment.Test.Status == 3 && i.Experiment.Test.TestId == 3)
                                     .Select(e => new ExperimentCollections
                                     {
                                         ExperimentNumber = e.Experiment.Test.ExperimentNumber,
                                         ExperimentName = e.Experiment.Test.Project.Name
                                     })
                                     .ToListAsync();

There can be multiple rows with the same ExperimentNumber, and I need to keep those duplicates out of myData.

For example, myData currently contains:

myData[0]
ExperimentNumber: 1520,
ExperimentName: ABC

myData[1]
ExperimentNumber: 1521,
ExperimentName: EFG

myData[2]
ExperimentNumber: 1520,
ExperimentName: HIJ

I need to keep the myData[2] row out of myData, using Entity Framework in a single query and without a foreach loop.

I tried the following code with .Distinct():

var myData = await _dbContext.Samples.Include(i => i.Experiment)
                                     .Include(i => i.Experiment.Test)
                                     .Include(i => i.Experiment.Test.Project)
                                     .Include(i => i.Samples)
                                     .Where(i => i.Experiment.Test.Status == 3 && i.Experiment.Test.TestId == 3)
                                     .Select(e => new ExperimentCollections
                                     {
                                         ExperimentNumber = e.Experiment.Test.ExperimentNumber,
                                         ExperimentName = e.Experiment.Test.Project.Name
                                     })
                                     .Distinct()
                                     .ToListAsync();

And I even tried .Distinct().OrderBy(i => i.TestNumber) too.

Abhi Singh

2 Answers


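The GROUP BY statement is often used with aggregate functions (COUNT, MAX, MIN, SUM, AVG) to group the result set by one or more columns.

After the Where clause, group by the column(s) that must be unique (GROUP BY column_name(s) in SQL, .GroupBy(...) in LINQ). Here that column is ExperimentNumber.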
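
A minimal sketch of that idea in LINQ, run against an in-memory list so it is self-contained. The class shape, the int type of ExperimentNumber, and the helper name OnePerNumber are assumptions for illustration, and choosing Min for the name is just one possible aggregate. Against a real DbContext the same GroupBy/Select shape would go after the .Where(...) call, and whether it translates to SQL depends on the EF version in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the projection class from the question.
public class ExperimentCollections
{
    public int ExperimentNumber { get; set; }
    public string ExperimentName { get; set; } = "";
}

public static class GroupByExample
{
    // Hypothetical helper: returns one row per ExperimentNumber and keeps
    // the alphabetically smallest ExperimentName within each group.
    public static List<ExperimentCollections> OnePerNumber(IEnumerable<ExperimentCollections> rows)
    {
        return rows
            .GroupBy(r => r.ExperimentNumber)
            .Select(g => new ExperimentCollections
            {
                ExperimentNumber = g.Key,
                ExperimentName = g.Min(r => r.ExperimentName)
            })
            .ToList();
    }

    public static void Main()
    {
        // Sample rows taken from the question.
        var rows = new List<ExperimentCollections>
        {
            new ExperimentCollections { ExperimentNumber = 1520, ExperimentName = "ABC" },
            new ExperimentCollections { ExperimentNumber = 1521, ExperimentName = "EFG" },
            new ExperimentCollections { ExperimentNumber = 1520, ExperimentName = "HIJ" }
        };

        foreach (var item in OnePerNumber(rows))
        {
            Console.WriteLine(item.ExperimentNumber + " " + item.ExperimentName);
        }
    }
}
```

With the sample rows, the 1520 group contains "ABC" and "HIJ", so the aggregate keeps "ABC" and the list ends up with two entries.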

MD. RAKIB HASAN

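You can use an extension method to keep rows with a duplicate key out of the list. It works on IEnumerable, so it runs in memory on results that have already been loaded rather than inside the database query.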

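using System;
using System.Collections.Generic;

public static class Extend
{
    // Yields only the first element seen for each key, in source order.
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        HashSet<TKey> seenKeys = new HashSet<TKey>();

        foreach (TSource element in source)
        {
            // HashSet.Add returns false when the key is already present.
            if (seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }
}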

You can use it like the following code:

var query = data.DistinctBy(p => p.ExperimentNumber).ToList();
foreach (var item in query)
{
    Console.WriteLine(item.ExperimentName + " " + item.ExperimentNumber);
}

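
As a side note, on .NET 6 or later System.Linq ships its own Enumerable.DistinctBy with the same call shape, so the custom helper may not be needed there. The sketch below uses that built-in method on the sample rows from the question. The Row record and the Unique helper are hypothetical names introduced for illustration, and .NET 6+ is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the projected rows in the question.
public record Row(int ExperimentNumber, string ExperimentName);

public static class DistinctByExample
{
    // Keeps the first row seen for each ExperimentNumber, using the
    // built-in Enumerable.DistinctBy (assumes .NET 6 or later).
    public static List<Row> Unique(IEnumerable<Row> data)
    {
        return data.DistinctBy(p => p.ExperimentNumber).ToList();
    }

    public static void Main()
    {
        var data = new List<Row>
        {
            new Row(1520, "ABC"),
            new Row(1521, "EFG"),
            new Row(1520, "HIJ")
        };

        foreach (var item in Unique(data))
        {
            Console.WriteLine(item.ExperimentName + " " + item.ExperimentNumber);
        }
        // Prints:
        // ABC 1520
        // EFG 1521
    }
}
```

Like the custom helper, this runs in memory, so with Entity Framework it would be applied after the query results have been materialized.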

Jack J Jun - MSFT