20:59:50 <ashimema> #startmeeting Development IRC meeting 25 September 2019
20:59:50 <huginn> Meeting started Wed Sep 25 20:59:50 2019 UTC.  The chair is ashimema. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:59:50 <huginn> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:59:50 <huginn> The meeting name has been set to 'development_irc_meeting_25_september_2019'
20:59:52 <davidnind[m]> Will be back on my desktop soon, on mobile phone at the moment...
21:00:02 <ashimema> #topic Introductions
21:00:13 <ashimema> #info Martin Renvoize, PTFS Europe
21:00:25 * ashimema is aiming for a short one tonight
21:00:29 <davidnind[m]> #info David Nind, NZ
21:01:01 <thd> #info Thomas Dukleth, Agogme, New York City - no more jury duty for a few years again
21:03:57 <ashimema> #topic Announcements
21:05:04 <ashimema> #info The 'adventurous pushing' window is now closed.. though I will contemplate a few more during Marseile hackfest.  Lets move into the polishing phase.
21:05:31 <ashimema> #info Marseille Hackfest is upon us... 30 Sept - 4th Oct
21:05:41 <ashimema> anyone have anything else?
21:05:49 <cait> #info Katrin Fischer, BSZ, Germany
21:06:04 <nkuitse> #info Paul Hoffman, Fenway Library Organization, USA
21:06:06 <ashimema> yeay :)
21:06:18 <ashimema> tcohen around?
21:06:22 <ashimema> rmaints
21:06:32 <ashimema> #topic  Update from the Release manager (19.11)
21:08:01 <ashimema> #info As above, we're now moving into the polishing phase of the cycle.. there's still time for a few more minor enhancements but generally I would like the team to concentrate on polishing what already pushed and bugfixes.
21:08:27 <ashimema> #topic Updates from the Release Maintainers
21:08:31 <cait> I think some bigger things just hit the QA queue, but I am bit behind
21:08:48 <cait> maybe we should do some triage work next week
21:08:54 <ashimema> #info The latest releases have all gone out, well done RMaints and Team.
21:09:09 <ashimema> #topic Updates from the QA team
21:09:22 <cait> it's been a bit of a quiet summer
21:09:43 <cait> let me take a quick look
21:09:50 <ashimema> #info It's been a quiet summer, but some bigger bugs have recently hit the QA queue and need attention.
21:10:08 <ashimema> #info cait and ashimema will work on some triage next week at Marseille hackfest.
21:10:13 <cait> 26 majors/criticals that need attention, in different stages
21:10:17 <ashimema> anything else cait
21:10:23 <cait> most are new/nso, so work for everyone there
21:10:27 * ashimema has a headache coming on so is rushing through a bit ;)
21:10:30 <cait> 77 in the QA queue
21:10:43 <cait> i hope to have a little session next week with the attending parts of QA team to discuss a few of them together
21:11:09 <ashimema> #info QA queue: 26 Major/Critical, 77 Total
21:11:35 <cait> not 26 in the qa queue... but i would like the to be right now
21:12:01 <ashimema> oops
21:12:06 <ashimema> my bad
21:12:21 <ashimema> moving on?..
21:12:24 <cait> yep
21:12:42 <ashimema> #topic General development discussion (trends, ideas, ...)
21:13:45 <ashimema> #topic Should we allow Koha::Object methods that normally return Koha::Object or Koha::Objects type object to ever return 'undef'?
21:14:18 <cait> is there a best practice there we could follow?
21:14:26 <ashimema> So.. whilst QAing bug 23272 I noticed that we're starting to develop a pattern bad practice.
21:14:26 <huginn> 04Bug http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23272 enhancement, P5 - low, ---, tomascohen, RESOLVED FIXED, Koha::AuthorisedValue should use Koha::Object::Limit::Library
21:15:13 <ashimema> we could either fall to QAers to make sure any calls to such methods are always of a ternary form
21:15:32 <cait> hm
21:15:45 <cait> if you get an empty object - no checks required?
21:16:31 <ashimema> $thing = defined($result->link_method) ? $result->link_method->related_method : undef;
21:17:04 <ashimema> or... we could always return an object, object set and as such do away with the need for the defined check
21:18:02 <ashimema> I mostly wanted to get some opinions on it.. before really suggesting a coding guideline
21:18:13 <cait> it sounds like the nicer way, but not sure i have enough knowledge to be helpful here :)
21:18:30 <ashimema> it's mostly a matter of style.. but I've certinly seen a number of cases lately where the defined check was missed and as such we ended up with bugs
21:18:42 <cait> bugs are ba
21:18:42 <cait> d
21:18:54 <thd> Undefined may have a logical status which may be distinct from an empty set value, however, it seems to be too often used without any logical distinction between undefined and empty set.
21:19:31 <cait> I tihnk in this case there is not really a difference between empty and undef
21:20:18 <cait> ashimema: if we have checks for 'number of objects returned' could that be a problem?
21:20:24 <ashimema> I think when the return is expected to be a set rather than a singular then we should always return a set (even if empty).. but for the cases where a singular is expected then undef checks are more sane
21:20:30 <cait> empty could still be counted?
21:20:45 <ashimema> yup...
21:20:52 <thd> Undefined values provoke bugs where the code is not expecting undefined.  I personally find seeing undefined to be ugly, difficult to read in some context, and seldom necessary.
21:21:25 <ashimema> thd++
21:22:28 <davidnind[m]> is this covered in the devlopment guide? looks like it should be if it isn't already
21:22:42 <ashimema> well.. it sounds like there's interest in such guidance so I'll try to come up with some words and we can continue discussion next meeting.
21:23:01 <ashimema> nope.. not yet covered davidnind[m]
21:23:37 <ashimema> lets jump on
21:23:56 <ashimema> #topic What's the best way to limit the amount of memory consumed when running a guided report?
21:24:00 <thd> I am trying to keep additional undefined values out of wiki migration where I can.  Undefined values might not be avoidable for some columns.
21:24:23 <ashimema> #info Bug 23626
21:24:23 <huginn> 04Bug http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23626 major, P2, ---, paul, Signed Off , Add a system preference to limit the number of rows of data used in a chart when viewing report results
21:24:31 <nkuitse> that's my bug
21:24:46 <nkuitse> we've had several outages and service degradations recently due to this
21:25:02 <ashimema> This one was only braught to my attention about an hour ago.. not had a great deal of time to look into it..
21:25:13 <nkuitse> the charting feature in guided_reports.pl consumes unbounded memory -- all data is loaded into memory, converted to JSON (so then 2 copies in mem) and only then sent back to browser
21:25:26 <ashimema> can you summarise the problem nkuitse
21:25:30 <nkuitse> one report recently led to (I think) 5+ GB above and beyond normal memory consumption
21:26:04 <nkuitse> the problem is in guided_reports.pl
21:26:33 <nkuitse> while (my $row = $sth2->fetchrow_arrayref()) {
21:26:35 <nkuitse> my @cells = map { +{ cell => $_ } } @$row;
21:26:41 <nkuitse> push @allrows, { cells => \@cells };
21:26:44 <nkuitse> }
21:26:57 <nkuitse> then it passes \@allrows to the TT2 template
21:27:22 <nkuitse> so that the charting javascript can (if the user ends up charting) use all results
21:27:57 <nkuitse> the full results aren't needed unless the user ends up making a chart
21:28:17 <nkuitse> but guided_reports.pl has no way of knowing whether the user will do that
21:28:35 <nkuitse> (and only if they check the box that says "use all rows" or whatever when creating the chart!)
21:28:40 <nkuitse> that's the long and short of it
21:28:45 <ashimema> so it's only for guided reports?
21:28:48 <nkuitse> yes
21:29:03 <cait> saved sql too?
21:29:11 <cait> i think they share templates
21:29:19 <nkuitse> aren't they the same thing?
21:29:25 * ashimema can't remember off the top of his head where/when graphs become available
21:29:31 <cait> 19.05
21:29:34 <cait> um sorry no
21:29:38 <nkuitse> no, we're running 18.11
21:29:39 <cait> 18.11 have them too
21:29:41 <cait> yes
21:30:08 <cait> nkuitse: there is the guided reports step by step thing - and sql - but when you run them it's the same yes
21:30:11 <ashimema> sorry.. I didn't mean version.. I meant what reports allow graphs.. whether it's anything you build of just  limited subset
21:30:18 <nkuitse> anything at all
21:30:58 <davidnind[m]> does that mean there is someting in the other report-related Perl scripts that stops them taking over the system resources?
21:30:59 <nkuitse> the big one that first brought this to our attention was a report of all items at a library
21:31:21 <davidnind[m]> speaking from a point of ignorance about all things Perl :)
21:31:31 <nkuitse> i'm not sure -- i haven't checked, or at least i don't think i have
21:31:32 <ashimema> My gut feeling is that limiting the return by default is incorrect as it leads to meaningless output
21:32:03 <cait> i tihnk we need some message/tooltip in the gui
21:32:12 <ashimema> to me a requiest to build a graph should be handled distinctly and with a warning of possible memory issues..
21:32:27 <nkuitse> the patches in the big report default to unlimited
21:32:36 <ashimema> I've not had a chance to look at your patch yet so I may be misunderstanding your proposed solution
21:32:47 <cait> i tihnk it adds a pref
21:32:57 <nkuitse> a new sys pref that you can set to limit the number of rows
21:33:23 <nkuitse> so, for example, if you feel pretty good about allowing 10,000 rows then you set it at 10,000
21:33:33 <nkuitse> if you never have memory issues then you leave it unset, i.e., unlimited
21:33:34 <thd> Would it not help to introduce a slight delay to first query the size of the result set and if it would be over some large number of rows then prompt with an altered query form with some restriction set to avoid thrashing the system along with a warning.
21:33:46 <ashimema> but it would be incredibly report dependant
21:33:56 <nkuitse> the query for the result set size was bug 23624
21:33:56 <huginn> 04Bug http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23624 normal, P5 - low, ---, paul, Passed QA , Count rows in report without (potentially) consuming all memory
21:34:01 <ashimema> 10,000 rows for a report that only returns two fields would be low memory..
21:34:17 <nkuitse> there's a very nice fix for 23624 -- very clean
21:34:23 <ashimema> 10,000 rows for a report that returns 50 fields is obviously not
21:35:23 <nkuitse> for us, the problem is the outages -- our staff need to run reports but we've had to tell them to hold off on anything that might generate large results
21:35:41 <nkuitse> but how do they know if a report will generate large results? -- they're not DB folks!
21:35:43 <thd> Is there any efficient way to test quickly for the overall size of the returned data before actually returning it?
21:36:00 <nkuitse> not that i know of, but we could certainly use some heuristics
21:36:19 <nkuitse> like $nrows * $ncols * 100 or something
21:36:23 <cait> could we include the pref setting int the gui?
21:36:33 <cait> as a hint?
21:36:45 <cait> please note... only a maximum of x rows will be processed or something like that
21:36:50 <nkuitse> you mean a maximum that the user can set when running the report?
21:36:56 <nkuitse> oh, sorry, i misunderstood
21:37:19 <nkuitse> yes, some recent tweaking in the work on the bug does just that
21:37:43 <nkuitse> but keep in mind that the limit would not affect browsing through report results
21:37:51 <nkuitse> because that uses LIMIT and OFFSET
21:38:05 <cait> a hint on the graph dialog maybe
21:38:17 <cait> i just want to avoid people not being aware of limits
21:38:49 <nkuitse> yeah
21:39:02 <nkuitse> that's exactly what the latest rev of the patch does:
21:39:12 <cait> cool
21:39:19 <ashimema> oh good.. Tomas already did the followup I would have proposed in bug 23624
21:39:19 <huginn> 04Bug http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23624 normal, P5 - low, ---, paul, Passed QA , Count rows in report without (potentially) consuming all memory
21:39:19 <nkuitse> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=92994
21:39:59 <nkuitse> the wording is "Include first n rows of full report (ignore pagination)" -- that's the label for the checkbox
21:40:10 <nkuitse> so it's pretty clear
21:40:47 <cait> hm yes - wonder if we shuld mention the pref, but too much text is not good either
21:41:31 <nkuitse> i'm fine with wording tweaks; i'm lobbying for a quick fix now and then improve on it later
21:41:40 <nkuitse> as long as we don't paint ourselves into a corner in the process
21:42:13 <cait> nkuitse: it was me saying on the bug that maintenance releases just happened
21:42:26 <cait> so it will be a little while
21:42:31 <cait> next month
21:42:58 <nkuitse> if it's that long then we'll patch it on our server and reverse it before we upgrade
21:43:20 <nkuitse> it's a pain, because bywater is managing our instance (we host it), but i don't want to wait much longer
21:44:12 <nkuitse> i don't know how many people have the requisite permisions to create reports, but i'm dreading a bad query that's missing a JOIN
21:44:18 <nkuitse> we've got ~2.2 million items
21:44:25 <cait> yes, that's a lot
21:44:27 <ashimema> it's not an imediate security issue so it won't get a special release.
21:44:33 <nkuitse> 888,000 bibs
21:44:42 <ashimema> so not before 22 October in reality
21:44:57 <nkuitse> right, DOS only -- though we've had some data loss (recovered from recent backup)
21:45:01 <ashimema> personally.. we just train all customers to take care when running reports
21:45:24 <davidnind[m]> could using something like metabase or Urungi in the meantime take the load of your server for reports (I haven't tried either of these, and not familiar with them except what I saw at KohaCon)
21:45:45 <nkuitse> i'm woefully ignorant of fancy reporting tools
21:46:01 <nkuitse> i just head straight for mysql, i.e., from command line
21:46:15 <ashimema> I think I'd prefer an appraoch that simply didn't return the all array untill requested
21:46:17 <cait> the sql reports are mighty, but they bare a bit of risk that's true
21:46:32 <cait> we have had some libraries bring down their instance with reports pre-graphs
21:46:33 <nkuitse> and our users like reports in Koha, when we haven't told them to stop running them that is ;-)
21:46:37 <ashimema> I don't see how a graph that didn't contain all rows would be useful to the end user
21:46:44 <ashimema> at the same time I understand your worries
21:47:02 <davidnind[m]> is there a way in Perl to limit what resources a script uses? something like nice that you can use on the command line (speaking from a point of ignorance about system admin + Perl)
21:47:05 <nkuitse> the problem is, imho, that the graphing should be done on reports with a GROUP BY
21:47:29 <nkuitse> rather than reading in a gazillion rows of raw data and then compiling that into the pretty pie chart the user's after
21:47:35 <davidnind[m]> sorry, thinking out loud here :)
21:47:54 <nkuitse> i don't know of a memnice kind of thing -- ionice might help, i suppose
21:47:54 <ashimema> I'm not sure how that would work nkuitse..
21:48:10 <nkuitse> exactly -- i agree
21:48:13 <ashimema> you can write pretty much any arbitrary read only sql in our reports system..
21:48:22 <nkuitse> yep
21:48:27 <ashimema> so how would we programatically be able to add in a GROUP BY ?
21:48:34 <nkuitse> but not knowing what you're doing can bring down the whole server
21:48:45 <nkuitse> i'm not suggesting munging in a GROUP BY
21:48:48 <ashimema> again.. I may be missing something.. it's been a while since I worked in this particular area
21:49:30 <nkuitse> just pointing out that charting is showing us that some reports make sense to chart and others don't
21:49:44 <ashimema> totally
21:49:51 <nkuitse> but you can still try to chart the ones that don't make sense
21:50:08 <nkuitse> and Koha has no idea which is which
21:50:33 <nkuitse> what i really wanted to do -- no offense to anyone who worked on the charting implementation -- was to remove charting altogether
21:50:54 <nkuitse> since there are other dedicated tools for visualizing data -- "do one thing well"!
21:51:05 <rangi> what would make a lot more sense
21:51:09 <cait> sorr, got to leave
21:51:12 <rangi> is to add a syspref, disable charts
21:51:15 <nkuitse> 's ok
21:51:21 <nkuitse> ooh, i like that
21:51:24 <rangi> then for your use case, when you can't trust your users, you can disable charts
21:51:27 <ashimema> I like that appraoch rangi
21:51:32 <nkuitse> very clean, very easy
21:51:33 <rangi> then those who can, can leave it on
21:51:52 <nkuitse> charting should be on by default so as not to surprise people
21:51:59 <nkuitse> genius!
21:52:02 <nkuitse> :-)
21:52:38 <nkuitse> i can take a look at the code and come up with a proposed patch
21:52:47 <nkuitse> do you think it would make sense to start a new ticket for it?
21:52:53 <ashimema> that would be great
21:52:55 <nkuitse> i don't want to muddy the waters, so to speak
21:53:07 <caroline_crazycatlady> [off] rangi says nothing the whole meeting and swoops in with simple yet effective solution from nowhere :)
21:53:12 <ashimema> yeah, create a new bug and perhaps add the original into the 'see also'
21:53:16 <nkuitse> hah!  :-)
21:53:22 <ashimema> that way you can see where the thinking came from
21:53:25 <nkuitse> ok, yeah, i'll do that ashimema
21:53:45 <ashimema> thanks nkuitse
21:54:03 <nkuitse> i'm happy with this -- still going to patch something but i like the long-term solution a lot better
21:54:08 <nkuitse> thanks again rangi
21:54:14 <rangi> no worries
21:54:31 <nkuitse> bye all -- i've got to eat dinner -- thanks for helping with this!
21:54:39 <ashimema> #info nkuitse is going to subit a new bug with the alternative of making the graphing availabilty syspref controlled.
21:54:50 <ashimema> thanks :)
21:55:17 <ashimema> brill.. next topic
21:55:46 <ashimema> #topic API Proposal for related resources (HAL)
21:55:49 <ashimema> Not sure about anyone else, but I've not had a chance to get myself up to speed on this one..
21:56:14 <ashimema> and without tcohen here to promote it I think we should postpone discussion untill the next meeting.
21:57:01 <thd> The link to the former proposal is missing, the only link is to the HAL documentation.
21:57:50 <ashimema> I'll try and sort that and put it in the next agenda..
21:58:14 <ashimema> so.. moving on
21:58:26 <ashimema> #topic Review of coding guidelines
21:58:31 <ashimema> #info Nothing to report
21:58:44 <ashimema> #topic Set time of next meeting
22:00:55 <ashimema> 9th Oct would follow the usual pattern
22:01:29 <ashimema> 14UTC for the earlier slow
22:01:37 <ashimema> any objections/requests?
22:02:08 <thd> +1
22:02:55 <ashimema> #info Next meeting: 09 October 2019, 14 UTC
22:03:14 <ashimema> #endmeeting