I have a MongoDB 4.0.6 instance using the WiredTiger storage engine, and the documentation says that db.serverStatus() includes a storageEngine section with some information on the WiredTiger flags. On my instance there is no storageEngine section in the output at all. Am I missing something?
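For reference, this is what I'm running in the mongo shell to look for that section (a rough sketch; the fields in the comments are just what I'd expect from the docs):

// Run in the mongo shell against the instance
db.serverStatus().storageEngine
// Per the docs I'd expect a sub-document along the lines of
// { "name" : "wiredTiger", ... }
// but accessing the field just returns undefined here, since the
// serverStatus() output has no storageEngine section at all.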
To be specific, I have a sharded cluster with a 3-node replica set in which one of the nodes is an arbiter, and this setup leads to an ever-growing WiredTigerLAS.wt file when many records are created quickly. The documentation says that "majority" read concern can be turned off with --enableMajorityReadConcern false, which we have done, but that did not stop the lookaside file growth. I'd like to check the status of that flag, and again following the documentation I'm using db.serverStatus(), but contrary to the documentation the output contains no storageEngine section at all.
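In case it matters, the only other way I can think of to confirm the flag is to read back the parsed startup options (a sketch; I'm assuming the setting surfaces under the replication section of the parsed output):

// Show the options this mongod was actually started with
db.serverCmdLineOpts().parsed
// equivalent admin command:
db.adminCommand({ getCmdLineOpts: 1 }).parsed
// then look for enableMajorityReadConcern: false under replication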
Maybe I should just change the arbiter into a data-bearing node (a rough sketch of that reconfiguration is below, after the rs.conf() output), but I was hoping to avoid the cost. Any help with this issue is appreciated.
Adding rs.status() (note that the secondary is currently resyncing from the primary):
{
    "set" : "shardRS1",
    "date" : ISODate("2019-06-06T17:09:06.986Z"),
    "myState" : 2,
    "term" : NumberLong(66),
    "syncingTo" : "1b:27018",
    "syncSourceHost" : "1b:27018",
    "syncSourceId" : 0,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1559840946, 1),
            "t" : NumberLong(66)
        },
        "readConcernMajorityOpTime" : {
            "ts" : Timestamp(1559840946, 1),
            "t" : NumberLong(66)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1559840946, 1),
            "t" : NumberLong(66)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1559840946, 1),
            "t" : NumberLong(66)
        }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "1b:27018",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 177202,
            "optime" : {
                "ts" : Timestamp(1559840936, 1),
                "t" : NumberLong(66)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1559840936, 1),
                "t" : NumberLong(66)
            },
            "optimeDate" : ISODate("2019-06-06T17:08:56Z"),
            "optimeDurableDate" : ISODate("2019-06-06T17:08:56Z"),
            "lastHeartbeat" : ISODate("2019-06-06T17:09:05.722Z"),
            "lastHeartbeatRecv" : ISODate("2019-06-06T17:09:05.263Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1559662230, 1),
            "electionDate" : ISODate("2019-06-04T15:30:30Z"),
            "configVersion" : 2
        },
        {
            "_id" : 1,
            "name" : "1d:27018",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 177203,
            "optime" : {
                "ts" : Timestamp(1559840946, 1),
                "t" : NumberLong(66)
            },
            "optimeDate" : ISODate("2019-06-06T17:09:06Z"),
            "syncingTo" : "1b:27018",
            "syncSourceHost" : "1b:27018",
            "syncSourceId" : 0,
            "infoMessage" : "",
            "configVersion" : 2,
            "self" : true,
            "lastHeartbeatMessage" : ""
        },
        {
            "_id" : 2,
            "name" : "1c:27018",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 177202,
            "lastHeartbeat" : ISODate("2019-06-06T17:09:05.933Z"),
            "lastHeartbeatRecv" : ISODate("2019-06-06T17:09:05.468Z"),
            "pingMs" : NumberLong(1),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "configVersion" : 2
        }
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1559840946, 1),
    "$gleStats" : {
        "lastOpTime" : Timestamp(0, 0),
        "electionId" : ObjectId("000000000000000000000000")
    },
    "lastCommittedOpTime" : Timestamp(1559840946, 1),
    "$configServerState" : {
        "opTime" : {
            "ts" : Timestamp(1559840926, 1),
            "t" : NumberLong(35)
        }
    },
    "$clusterTime" : {
        "clusterTime" : Timestamp(1559840946, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
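For what it's worth, this is how I've been eyeballing replication lag from that output once the resync finishes (a rough sketch; it just diffs each secondary's optimeDate against the primary's):

// Compare each SECONDARY's optimeDate with the PRIMARY's (mongo shell)
var s = rs.status();
var primary = s.members.filter(function (m) { return m.stateStr === "PRIMARY"; })[0];
s.members.forEach(function (m) {
    if (m.stateStr === "SECONDARY") {
        print(m.name + " lag (seconds): " + (primary.optimeDate - m.optimeDate) / 1000);
    }
});
// The shell helper rs.printSlaveReplicationInfo() prints a similar summary.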
Adding rs.conf():
{
    "_id" : "shardRS1",
    "version" : 2,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : true,
    "members" : [
        {
            "_id" : 0,
            "host" : "1b:27018",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "1d:27018",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "1c:27018",
            "arbiterOnly" : true,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 0,
            "tags" : {
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {
        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5c8743e6b6916b268c071e3a")
    }
}
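If converting the arbiter into a data-bearing member is the way out, this is roughly the reconfiguration I have in mind (a sketch only, not something I've run; it assumes 1c gets its own data directory and runs a normal shard mongod for shardRS1):

// On the primary: drop the arbiter, then re-add the host as a regular member
rs.remove("1c:27018")
// ...start a normal mongod for shardRS1 on 1c:27018 with a fresh dbpath...
rs.add("1c:27018")
// the new member then performs an initial sync and comes up as SECONDARY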
Comments (from Stennie):

… db.serverCmdLineOpts().parsed. – Stennie Jun 01 '19 at 22:35

… version() and db.version() in the mongo shell? – Stennie Jun 01 '19 at 22:35

… w:2 or w:majority write concern would throttle writes to keep up with replication, but it's not clear if that is the actual reason for the growth of your lookaside file. Is there significant replication lag (which could cause cache pressure)? Can you edit your question to include the output of rs.status() and rs.conf()? Have you changed any other server options from the default? How much RAM do your instances have? What specific O/S and version are you using? Are these bare metal instances, VMs, containers, ... ? I know that's a lot of questions, but this should be solvable. – Stennie Jun 04 '19 at 08:10

… rs.status() output wasn't helpful since the secondary was re-syncing at the time, but I'm curious about replication lag. If an increased write concern resolved your issue, it seems likely that the primary was getting ahead of your secondary and data was being pinned in the primary cache waiting for replication. Once reads start dipping into disk the performance will suffer, but a 100GB lookaside is unexpectedly large. Note: the default write concern is w:1 (not w:majority). Increasing the size of the WT cache may help somewhat but isn't likely to resolve this. – Stennie Jun 06 '19 at 07:36

… serviceExecutor flag (which is still experimental as at MongoDB 4.0). Some workloads had performance issues with the adaptive serviceExecutor so we haven't been confident to make this the default execution model yet. I didn't find any relevant open issues but it would be good to remove any potential impact since your problem seems easily reproducible. – Stennie Jun 06 '19 at 07:36
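Following up on the write concern suggestion in the comments, this is the sort of change to the bulk load that I understand is being proposed (hypothetical database, collection, and documents, purely for illustration; the real load runs through our application):

// Hypothetical bulk load using w:"majority" so each batch waits for the
// secondary to acknowledge, throttling the primary to replication speed.
var docs = [];
for (var i = 0; i < 1000; i++) {
    docs.push({ n: i, createdAt: new Date() });
}
db.getSiblingDB("test").events.insertMany(docs, { writeConcern: { w: "majority", wtimeout: 60000 } });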