20:59:50 #startmeeting Development IRC meeting 25 September 2019 20:59:50 Meeting started Wed Sep 25 20:59:50 2019 UTC. The chair is ashimema. Information about MeetBot at http://wiki.debian.org/MeetBot. 20:59:50 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 20:59:50 The meeting name has been set to 'development_irc_meeting_25_september_2019' 20:59:52 Will be back on my desktop soon, on mobile phone at the moment... 21:00:02 #topic Introductions 21:00:13 #info Martin Renvoize, PTFS Europe 21:00:25 * ashimema is aiming for a short one tonight 21:00:29 #info David Nind, NZ 21:01:01 #info Thomas Dukleth, Agogme, New York City - no more jury duty for a few years again 21:03:57 #topic Announcements 21:05:04 #info The 'adventurous pushing' window is now closed.. though I will contemplate a few more during Marseile hackfest. Lets move into the polishing phase. 21:05:31 #info Marseille Hackfest is upon us... 30 Sept - 4th Oct 21:05:41 anyone have anything else? 21:05:49 #info Katrin Fischer, BSZ, Germany 21:06:04 #info Paul Hoffman, Fenway Library Organization, USA 21:06:06 yeay :) 21:06:18 tcohen around? 21:06:22 rmaints 21:06:32 #topic Update from the Release manager (19.11) 21:08:01 #info As above, we're now moving into the polishing phase of the cycle.. there's still time for a few more minor enhancements but generally I would like the team to concentrate on polishing what already pushed and bugfixes. 21:08:27 #topic Updates from the Release Maintainers 21:08:31 I think some bigger things just hit the QA queue, but I am bit behind 21:08:48 maybe we should do some triage work next week 21:08:54 #info The latest releases have all gone out, well done RMaints and Team. 21:09:09 #topic Updates from the QA team 21:09:22 it's been a bit of a quiet summer 21:09:43 let me take a quick look 21:09:50 #info It's been a quiet summer, but some bigger bugs have recently hit the QA queue and need attention. 21:10:08 #info cait and ashimema will work on some triage next week at Marseille hackfest. 21:10:13 26 majors/criticals that need attention, in different stages 21:10:17 anything else cait 21:10:23 most are new/nso, so work for everyone there 21:10:27 * ashimema has a headache coming on so is rushing through a bit ;) 21:10:30 77 in the QA queue 21:10:43 i hope to have a little session next week with the attending parts of QA team to discuss a few of them together 21:11:09 #info QA queue: 26 Major/Critical, 77 Total 21:11:35 not 26 in the qa queue... but i would like the to be right now 21:12:01 oops 21:12:06 my bad 21:12:21 moving on?.. 21:12:24 yep 21:12:42 #topic General development discussion (trends, ideas, ...) 21:13:45 #topic Should we allow Koha::Object methods that normally return Koha::Object or Koha::Objects type object to ever return 'undef'? 21:14:18 is there a best practice there we could follow? 21:14:26 So.. whilst QAing bug 23272 I noticed that we're starting to develop a pattern bad practice. 21:14:26 04Bug http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23272 enhancement, P5 - low, ---, tomascohen, RESOLVED FIXED, Koha::AuthorisedValue should use Koha::Object::Limit::Library 21:15:13 we could either fall to QAers to make sure any calls to such methods are always of a ternary form 21:15:32 hm 21:15:45 if you get an empty object - no checks required? 21:16:31 $thing = defined($result->link_method) ? $result->link_method->related_method : undef; 21:17:04 or... we could always return an object, object set and as such do away with the need for the defined check 21:18:02 I mostly wanted to get some opinions on it.. before really suggesting a coding guideline 21:18:13 it sounds like the nicer way, but not sure i have enough knowledge to be helpful here :) 21:18:30 it's mostly a matter of style.. but I've certinly seen a number of cases lately where the defined check was missed and as such we ended up with bugs 21:18:42 bugs are ba 21:18:42 d 21:18:54 Undefined may have a logical status which may be distinct from an empty set value, however, it seems to be too often used without any logical distinction between undefined and empty set. 21:19:31 I tihnk in this case there is not really a difference between empty and undef 21:20:18 ashimema: if we have checks for 'number of objects returned' could that be a problem? 21:20:24 I think when the return is expected to be a set rather than a singular then we should always return a set (even if empty).. but for the cases where a singular is expected then undef checks are more sane 21:20:30 empty could still be counted? 21:20:45 yup... 21:20:52 Undefined values provoke bugs where the code is not expecting undefined. I personally find seeing undefined to be ugly, difficult to read in some context, and seldom necessary. 21:21:25 thd++ 21:22:28 is this covered in the devlopment guide? looks like it should be if it isn't already 21:22:42 well.. it sounds like there's interest in such guidance so I'll try to come up with some words and we can continue discussion next meeting. 21:23:01 nope.. not yet covered davidnind[m] 21:23:37 lets jump on 21:23:56 #topic What's the best way to limit the amount of memory consumed when running a guided report? 21:24:00 I am trying to keep additional undefined values out of wiki migration where I can. Undefined values might not be avoidable for some columns. 21:24:23 #info Bug 23626 21:24:23 04Bug http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23626 major, P2, ---, paul, Signed Off , Add a system preference to limit the number of rows of data used in a chart when viewing report results 21:24:31 that's my bug 21:24:46 we've had several outages and service degradations recently due to this 21:25:02 This one was only braught to my attention about an hour ago.. not had a great deal of time to look into it.. 21:25:13 the charting feature in guided_reports.pl consumes unbounded memory -- all data is loaded into memory, converted to JSON (so then 2 copies in mem) and only then sent back to browser 21:25:26 can you summarise the problem nkuitse 21:25:30 one report recently led to (I think) 5+ GB above and beyond normal memory consumption 21:26:04 the problem is in guided_reports.pl 21:26:33 while (my $row = $sth2->fetchrow_arrayref()) { 21:26:35 my @cells = map { +{ cell => $_ } } @$row; 21:26:41 push @allrows, { cells => \@cells }; 21:26:44 } 21:26:57 then it passes \@allrows to the TT2 template 21:27:22 so that the charting javascript can (if the user ends up charting) use all results 21:27:57 the full results aren't needed unless the user ends up making a chart 21:28:17 but guided_reports.pl has no way of knowing whether the user will do that 21:28:35 (and only if they check the box that says "use all rows" or whatever when creating the chart!) 21:28:40 that's the long and short of it 21:28:45 so it's only for guided reports? 21:28:48 yes 21:29:03 saved sql too? 21:29:11 i think they share templates 21:29:19 aren't they the same thing? 21:29:25 * ashimema can't remember off the top of his head where/when graphs become available 21:29:31 19.05 21:29:34 um sorry no 21:29:38 no, we're running 18.11 21:29:39 18.11 have them too 21:29:41 yes 21:30:08 nkuitse: there is the guided reports step by step thing - and sql - but when you run them it's the same yes 21:30:11 sorry.. I didn't mean version.. I meant what reports allow graphs.. whether it's anything you build of just limited subset 21:30:18 anything at all 21:30:58 does that mean there is someting in the other report-related Perl scripts that stops them taking over the system resources? 21:30:59 the big one that first brought this to our attention was a report of all items at a library 21:31:21 speaking from a point of ignorance about all things Perl :) 21:31:31 i'm not sure -- i haven't checked, or at least i don't think i have 21:31:32 My gut feeling is that limiting the return by default is incorrect as it leads to meaningless output 21:32:03 i tihnk we need some message/tooltip in the gui 21:32:12 to me a requiest to build a graph should be handled distinctly and with a warning of possible memory issues.. 21:32:27 the patches in the big report default to unlimited 21:32:36 I've not had a chance to look at your patch yet so I may be misunderstanding your proposed solution 21:32:47 i tihnk it adds a pref 21:32:57 a new sys pref that you can set to limit the number of rows 21:33:23 so, for example, if you feel pretty good about allowing 10,000 rows then you set it at 10,000 21:33:33 if you never have memory issues then you leave it unset, i.e., unlimited 21:33:34 Would it not help to introduce a slight delay to first query the size of the result set and if it would be over some large number of rows then prompt with an altered query form with some restriction set to avoid thrashing the system along with a warning. 21:33:46 but it would be incredibly report dependant 21:33:56 the query for the result set size was bug 23624 21:33:56 04Bug http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23624 normal, P5 - low, ---, paul, Passed QA , Count rows in report without (potentially) consuming all memory 21:34:01 10,000 rows for a report that only returns two fields would be low memory.. 21:34:17 there's a very nice fix for 23624 -- very clean 21:34:23 10,000 rows for a report that returns 50 fields is obviously not 21:35:23 for us, the problem is the outages -- our staff need to run reports but we've had to tell them to hold off on anything that might generate large results 21:35:41 but how do they know if a report will generate large results? -- they're not DB folks! 21:35:43 Is there any efficient way to test quickly for the overall size of the returned data before actually returning it? 21:36:00 not that i know of, but we could certainly use some heuristics 21:36:19 like $nrows * $ncols * 100 or something 21:36:23 could we include the pref setting int the gui? 21:36:33 as a hint? 21:36:45 please note... only a maximum of x rows will be processed or something like that 21:36:50 you mean a maximum that the user can set when running the report? 21:36:56 oh, sorry, i misunderstood 21:37:19 yes, some recent tweaking in the work on the bug does just that 21:37:43 but keep in mind that the limit would not affect browsing through report results 21:37:51 because that uses LIMIT and OFFSET 21:38:05 a hint on the graph dialog maybe 21:38:17 i just want to avoid people not being aware of limits 21:38:49 yeah 21:39:02 that's exactly what the latest rev of the patch does: 21:39:12 cool 21:39:19 oh good.. Tomas already did the followup I would have proposed in bug 23624 21:39:19 04Bug http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23624 normal, P5 - low, ---, paul, Passed QA , Count rows in report without (potentially) consuming all memory 21:39:19 https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=92994 21:39:59 the wording is "Include first n rows of full report (ignore pagination)" -- that's the label for the checkbox 21:40:10 so it's pretty clear 21:40:47 hm yes - wonder if we shuld mention the pref, but too much text is not good either 21:41:31 i'm fine with wording tweaks; i'm lobbying for a quick fix now and then improve on it later 21:41:40 as long as we don't paint ourselves into a corner in the process 21:42:13 nkuitse: it was me saying on the bug that maintenance releases just happened 21:42:26 so it will be a little while 21:42:31 next month 21:42:58 if it's that long then we'll patch it on our server and reverse it before we upgrade 21:43:20 it's a pain, because bywater is managing our instance (we host it), but i don't want to wait much longer 21:44:12 i don't know how many people have the requisite permisions to create reports, but i'm dreading a bad query that's missing a JOIN 21:44:18 we've got ~2.2 million items 21:44:25 yes, that's a lot 21:44:27 it's not an imediate security issue so it won't get a special release. 21:44:33 888,000 bibs 21:44:42 so not before 22 October in reality 21:44:57 right, DOS only -- though we've had some data loss (recovered from recent backup) 21:45:01 personally.. we just train all customers to take care when running reports 21:45:24 could using something like metabase or Urungi in the meantime take the load of your server for reports (I haven't tried either of these, and not familiar with them except what I saw at KohaCon) 21:45:45 i'm woefully ignorant of fancy reporting tools 21:46:01 i just head straight for mysql, i.e., from command line 21:46:15 I think I'd prefer an appraoch that simply didn't return the all array untill requested 21:46:17 the sql reports are mighty, but they bare a bit of risk that's true 21:46:32 we have had some libraries bring down their instance with reports pre-graphs 21:46:33 and our users like reports in Koha, when we haven't told them to stop running them that is ;-) 21:46:37 I don't see how a graph that didn't contain all rows would be useful to the end user 21:46:44 at the same time I understand your worries 21:47:02 is there a way in Perl to limit what resources a script uses? something like nice that you can use on the command line (speaking from a point of ignorance about system admin + Perl) 21:47:05 the problem is, imho, that the graphing should be done on reports with a GROUP BY 21:47:29 rather than reading in a gazillion rows of raw data and then compiling that into the pretty pie chart the user's after 21:47:35 sorry, thinking out loud here :) 21:47:54 i don't know of a memnice kind of thing -- ionice might help, i suppose 21:47:54 I'm not sure how that would work nkuitse.. 21:48:10 exactly -- i agree 21:48:13 you can write pretty much any arbitrary read only sql in our reports system.. 21:48:22 yep 21:48:27 so how would we programatically be able to add in a GROUP BY ? 21:48:34 but not knowing what you're doing can bring down the whole server 21:48:45 i'm not suggesting munging in a GROUP BY 21:48:48 again.. I may be missing something.. it's been a while since I worked in this particular area 21:49:30 just pointing out that charting is showing us that some reports make sense to chart and others don't 21:49:44 totally 21:49:51 but you can still try to chart the ones that don't make sense 21:50:08 and Koha has no idea which is which 21:50:33 what i really wanted to do -- no offense to anyone who worked on the charting implementation -- was to remove charting altogether 21:50:54 since there are other dedicated tools for visualizing data -- "do one thing well"! 21:51:05 what would make a lot more sense 21:51:09 sorr, got to leave 21:51:12 is to add a syspref, disable charts 21:51:15 's ok 21:51:21 ooh, i like that 21:51:24 then for your use case, when you can't trust your users, you can disable charts 21:51:27 I like that appraoch rangi 21:51:32 very clean, very easy 21:51:33 then those who can, can leave it on 21:51:52 charting should be on by default so as not to surprise people 21:51:59 genius! 21:52:02 :-) 21:52:38 i can take a look at the code and come up with a proposed patch 21:52:47 do you think it would make sense to start a new ticket for it? 21:52:53 that would be great 21:52:55 i don't want to muddy the waters, so to speak 21:53:07 [off] rangi says nothing the whole meeting and swoops in with simple yet effective solution from nowhere :) 21:53:12 yeah, create a new bug and perhaps add the original into the 'see also' 21:53:16 hah! :-) 21:53:22 that way you can see where the thinking came from 21:53:25 ok, yeah, i'll do that ashimema 21:53:45 thanks nkuitse 21:54:03 i'm happy with this -- still going to patch something but i like the long-term solution a lot better 21:54:08 thanks again rangi 21:54:14 no worries 21:54:31 bye all -- i've got to eat dinner -- thanks for helping with this! 21:54:39 #info nkuitse is going to subit a new bug with the alternative of making the graphing availabilty syspref controlled. 21:54:50 thanks :) 21:55:17 brill.. next topic 21:55:46 #topic API Proposal for related resources (HAL) 21:55:49 Not sure about anyone else, but I've not had a chance to get myself up to speed on this one.. 21:56:14 and without tcohen here to promote it I think we should postpone discussion untill the next meeting. 21:57:01 The link to the former proposal is missing, the only link is to the HAL documentation. 21:57:50 I'll try and sort that and put it in the next agenda.. 21:58:14 so.. moving on 21:58:26 #topic Review of coding guidelines 21:58:31 #info Nothing to report 21:58:44 #topic Set time of next meeting 22:00:55 9th Oct would follow the usual pattern 22:01:29 14UTC for the earlier slow 22:01:37 any objections/requests? 22:02:08 +1 22:02:55 #info Next meeting: 09 October 2019, 14 UTC 22:03:14 #endmeeting